Scrolling for the Truth: Banerjee Builds AI Tool to Verify Scientific Claims on Social Media

Stony Brook, NY, February 22, 2026 — Scroll through Instagram or X long enough, and you’ll see it — a reel insisting that “fruits are citrus, so you shouldn’t eat them with milk,” a thread warning “protein shakes wreck kidney function,” a carousel promising “this workout routine will fix your PCOS in 30 days.” Every third post seems to offer a health hack, often backed by a chart, a DOI link, and just enough scientific language to sound convincing.

But behind those posts is a tangle of dense scientific research that few people ever read.

“How do we get from what scientists actually know to what people are saying online?” asked Ritwik Banerjee, research assistant professor of computer science at Stony Brook University. “Right now, that bridge is very noisy. Claims slip, get simplified, or are simply made up.” Yet most of us never click through to read the original paper. We assume the citation is honest, especially when it confirms what we want to believe. “We need reliable tools to trace what’s real,” Banerjee said.

Ritwik Banerjee

Banerjee, working with graduate student Parth Manish Thapliyal, undergraduate students Ritesh Sunil Chavan and Samridh Samridh, and Dr. Chaoyuan Zuo of Nankai University, developed a solution for the CheckThat! 2025 challenge, organized as part of the CLEF conference held in Madrid, Spain. Their research attempts to bridge this information gap using AI: when a social media post invokes “science,” can an automated system figure out which research article, if any, actually matches the claim?

To do that, Banerjee’s team had to work through several layers of modern health communication. At one end are the posts themselves: short, informal messages about vaccines, diet, exercise, COVID, fertility treatments, and more. In the middle are blog posts, news stories, and explainer threads that try to summarize medical research for the public. At the far end are scientific articles: dense, technical papers written by and for specialists rather than casual scrollers.

“The information source is in the scientific publication world, and the claim is in the online discourse world,” Banerjee said. “Our job was to connect the dots between these two.”

Instead of throwing one large model at the whole problem, the system his group built breaks that task into stages. First, it decides whether a post is even part of a scientific conversation. “We had to ask: Is there a concrete, check-worthy health claim here? Is there a reference to a study or a researcher?” he said. “A lot of posts talk about wellness and medicine without saying anything you could actually verify.”

To make that judgment, the team trained what computer scientists call ‘classifiers’ — AI models that sort things into categories — to label each post along several dimensions. Does it contain a specific health claim that sounds testable? Does it mention a study, a journal, or a researcher by name? Does it at least gesture toward scientific evidence, or is it purely personal experience? The goal at this stage is not to decide whether a claim is true or false, but simply to identify which posts carry enough substance to investigate further. Incidentally, this portion of the work builds on earlier research by Banerjee and Zuo between 2018 and 2022, when the latter was a Ph.D. student at Stony Brook.
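As a rough illustration — not the team’s actual code or data — a check-worthiness classifier of this kind can be sketched in a few lines with scikit-learn. The training posts, labels, and model choice below are all hypothetical stand-ins:

```python
# A minimal, hypothetical sketch of the "is this a check-worthy scientific
# claim?" stage. The posts, labels, and model choice are illustrative
# stand-ins, not the team's actual data or architecture.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

train_posts = [
    "A 2021 Lancet study found the vaccine cut hospitalizations by 90%",
    "New research shows protein shakes wreck kidney function",
    "Feeling amazing after my morning yoga session!",
    "Love this smoothie recipe, so refreshing",
]
train_labels = [1, 1, 0, 0]  # 1 = contains a check-worthy health claim

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(train_posts, train_labels)

# In a real system, separate classifiers would also flag mentions of studies,
# journals, or researchers, and whether any evidence is cited at all.
prediction = clf.predict(["Researchers say this workout fixes PCOS in 30 days"])
```

In practice each question the paragraph lists would get its own classifier (or one multi-label model), trained on far more data.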

Once a post is tagged as scientifically relevant, the system reads it more closely. Names of drugs, diseases, journals, universities, and years are pulled out as anchors. A post that talks about “a 2021 Lancet study on Long COVID in healthcare workers,” for example, gives the system far more to work with than one that just says “new research shows supplements detox your liver” with no more detail.
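The anchor-extraction step can be sketched with simple pattern matching. Real systems use trained named-entity recognizers; the tiny journal list and `extract_anchors` function here are purely illustrative:

```python
import re

# Hypothetical sketch of the anchor-extraction step: pull years and known
# journal names out of a post. A real system would use a trained named-entity
# recognizer; this small journal list is just a stand-in.
KNOWN_JOURNALS = {"lancet", "nature", "jama", "bmj", "nejm"}

def extract_anchors(post: str) -> dict:
    words = set(re.findall(r"[a-z]+", post.lower()))
    return {
        "years": re.findall(r"\b(?:19|20)\d{2}\b", post),
        "journals": sorted(KNOWN_JOURNALS & words),
    }

post = "A 2021 Lancet study on Long COVID in healthcare workers"
print(extract_anchors(post))  # {'years': ['2021'], 'journals': ['lancet']}
```

A vague post like “new research shows supplements detox your liver” yields no anchors at all, which is exactly why it gives the system so little to work with.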

“Those identifiers become very strong hints,” Banerjee said. “If a post mentions a specific journal, a hospital, or a lab, that immediately narrows down the space of possible articles on the scientific literature side.”

The final step is to search through databases for the relevant paper, or set of papers, that best lines up with the claim. Here, Banerjee’s group leans on a mix of older and newer AI techniques. The system starts with fast, traditional search — the kind of keyword-based retrieval that has powered search engines for decades — to pull a manageable list of candidate articles from millions of possibilities. That stage favors papers whose titles and abstracts share important terms with the post and its extracted clues.
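That first, fast stage can be sketched with a rarity-weighted term-overlap score — a TF-IDF-style heuristic standing in for production keyword engines such as BM25. The abstracts below are invented:

```python
import math
from collections import Counter

# Hypothetical sketch of the fast first-stage retrieval: rank candidate
# abstracts by terms shared with the post, weighting rare terms more heavily
# (a TF-IDF-style stand-in for keyword engines such as BM25).
def tokenize(text):
    return text.lower().split()

def retrieve(query, abstracts, top_k=2):
    n = len(abstracts)
    df = Counter(t for a in abstracts for t in set(tokenize(a)))  # document frequency
    def score(abstract):
        terms = Counter(tokenize(abstract))
        return sum(terms[t] * math.log((n + 1) / (1 + df[t]))
                   for t in set(tokenize(query)) if t in terms)
    return sorted(abstracts, key=score, reverse=True)[:top_k]

abstracts = [
    "long covid symptoms in healthcare workers cohort study",
    "hypertension treatment outcomes in elderly patients",
    "vaccine efficacy against influenza in children",
]
best = retrieve("2021 study on long covid in healthcare workers", abstracts, top_k=1)
```

Terms shared with every candidate (like “in”) score near zero, while rare shared terms (“covid,” “workers”) pull the right abstract to the top.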

Then a slower, more sophisticated language model compares each candidate paper with the post’s claim, scoring how well they match and re-ranking the list. It’s important to note that this model does not generate any new text. It doesn’t write explanations, judgments, or health advice. It only shuffles existing articles into a better order.
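The re-ranking stage reduces to a sort over the shortlist. In the sketch below, `match_score` is a hypothetical placeholder for the language model that scores each claim–paper pair; a toy word-overlap scorer stands in for it:

```python
# Hypothetical sketch of the second stage: a scoring model re-orders the
# shortlist but never generates text. `match_score` stands in for the
# language model; a toy word-overlap (Jaccard) scorer takes its place here.
def rerank(claim, candidates, match_score):
    return sorted(candidates, key=lambda paper: match_score(claim, paper), reverse=True)

def toy_overlap(claim, paper):
    a, b = set(claim.lower().split()), set(paper.lower().split())
    return len(a & b) / max(len(a | b), 1)  # Jaccard similarity, in [0, 1]

shortlist = [
    "hypertension treatment outcomes in elderly patients",
    "long covid symptoms in healthcare workers cohort study",
]
ranked = rerank("long covid in healthcare workers", shortlist, toy_overlap)
```

The design point the paragraph makes is visible in the interface: `rerank` only returns a reordering of its input; nothing new is written.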

“That choice makes the system easier to evaluate and, ultimately, to trust,” Banerjee said. “Either we identified the right paper, or we didn’t. There’s no polished summary to hide the fact that the citation doesn’t really fit.”

On CheckThat! 2025 benchmarks, Banerjee’s system excelled (with 90% accuracy) at the early stages of the job. Compared with other teams, it was particularly strong at telling which posts contained scientific health claims at all, and at spotting mentions of researchers, institutions, and journals. It also did well in recognizing when posts were genuinely anchored in medical research rather than just using scientific language as decoration.

The hardest part, as expected, was the final jump from post to paper (70% accuracy). Even with all those safeguards, reliably picking out the exact article a post should be citing remains challenging. “It’s one thing to find a paper on hypertension or vaccine side effects,” Banerjee said. “It’s another thing to find the one the author had in mind, especially when the post is unintentionally vague or a bit careless with the details.”

For him, that difficulty is part of the point. If a purpose-built AI system, with access to full-text databases and carefully designed models, struggles to line up claims with their source papers, it suggests that many of us are navigating health information with less support than we think.

“In earlier projects, we manually followed how a medical result changed as it moved from a paper to a news story to a social media post,” he said. “You could see where caveats were dropped, where numbers were rounded, where language became more absolute. Now we’re trying to use AI to map that process at scale, to support human judgment, to show where misunderstandings are likely.”

Looking ahead, Banerjee sees several directions for this work. Many health posts embed screenshots of tables, charts, or PDF pages rather than clean links, while others mix text with short videos, voiceovers, and image memes. Expanding systems like this beyond plain text will be necessary if they are to reflect how people actually consume medical information online.

At the Social & Computational Intelligence Research (SCIRE) group, Banerjee and his students are exploring new ways of adapting smaller and lighter models to broader domains of healthcare, from nutrition myths to chronic disease management. “We developed a really interesting approach to train classifier models, called class distillation,” he said, “where models hundreds of times smaller, with less training data, were just as good as the behemoths in use today. We are currently working on adapting this approach to re-rank medical information.”

Just as important, he says, is thinking carefully about how such tools are used. A future version of this system might power a “see the studies” button next to viral health claims, surfacing relevant papers and pointing out when no good match exists — without stamping posts as true or false or telling readers what to believe.

“I don’t think AI will settle every argument about diets, vaccines, or treatments,” Banerjee said. “But if we can make it easier to see where a claim really comes from — or whether it comes from anywhere at all — that already changes the conversation.”

He paused, then added: “Technology will keep shifting. What we’re trying to improve is the connection between the health advice we see every day and the science that’s supposed to back it up. That is a gap worth closing, especially when it comes to people’s health.”

News Author

Ankita Nagpal