Studies Show ChatGPT and Other AI Tools Cite Retracted Research


Some artificial intelligence chatbots are giving answers based on flawed research from retracted scientific papers, recent studies show.

In a rush? Here are the quick facts:

  • AI chatbots sometimes cite retracted scientific papers without warning users.
  • ChatGPT, running GPT-4o, referenced retracted papers five times but warned users in only three.
  • Experts warn retraction data is inconsistent and often hard for AI to track.

The findings, which MIT Technology Review confirmed, raise doubts about AI reliability when it comes to answering scientific questions for researchers, students, and the general public.

AI chatbots are already known to sometimes fabricate references. But experts warn that even when the sources are real, problems arise if the papers themselves have been pulled from the scientific record.

The chatbot is “using a real paper, real material, to tell you something,” says Weikuan Gu, a medical researcher at the University of Tennessee, as reported by MIT. But, he adds, if people only look at the content of the answer and do not click through to the paper to see that it has been retracted, “that’s really a problem.”

MIT reports that Gu’s team tested ChatGPT, running on OpenAI’s GPT-4o model, with questions based on 21 retracted medical imaging papers. The chatbot referenced the retracted papers in five cases, but warned users about the retractions in only three of them. Another study found similar issues with GPT-4o mini, which failed to mention retractions at all.

The problem extends beyond ChatGPT. MIT evaluated research-oriented AI tools by testing Elicit, Ai2 ScholarQA, Perplexity, and Consensus. Each tool cited retracted studies without warning users, a pattern the researchers observed repeatedly across dozens of test cases. Some companies say they are now improving detection.

“Until recently, we didn’t have great retraction data in our search engine,” said Christian Salem, cofounder of Consensus, which has since added new sources to reduce errors.

Experts argue that retraction data is patchy and inconsistent. “Where things are retracted, they can be marked as such in very different ways,” says Caitlin Bakker from the University of Regina.

Researchers warn users to stay cautious. “We are at the very, very early stages, and essentially you have to be skeptical,” says Aaron Tay of Singapore Management University.
