Team teaches AI models to spot misleading scientific reporting

The dataset construction process: 1) collecting human-written scientific news related to COVID-19 from publicly available datasets and web resources, 2) selecting abstracts from CORD-19 to guide LLMs in generating articles via jailbreak prompts, 3) augmenting the dataset with an evidence corpus drawn from CORD-19.

Artificial intelligence isn’t always a reliable source of information: large language models (LLMs) like Llama and ChatGPT can be prone to “hallucinating” and inventing bogus facts. But what if AI could be used to detect mistaken or distorted claims, and help people find their way more confidently through a sea of potential distortions online and elsewhere?

In work presented at a workshop at the annual conference of the Association for the Advancement of Artificial Intelligence, researchers at Stevens Institute of Technology describe an AI architecture designed to do just that, using open-source LLMs and free versions of commercial LLMs to identify potentially misleading narratives in news reports on scientific discoveries.

“Inaccurate information is a big deal, especially when it comes to scientific content—we hear all the time from doctors who worry about their patients reading things online that aren’t accurate, for instance,” said K.P. Subbalakshmi, the paper’s co-author and a professor in the Department of Electrical and Computer Engineering at Stevens.

“We wanted to automate the process of flagging misleading claims and use AI to give people a better understanding of the underlying facts.”

To achieve that, a team of two Ph.D. students and two master’s students led by Subbalakshmi first created a dataset of 2,400 news reports on scientific breakthroughs.

The dataset included both human-generated reports, drawn either from reputable science journals or low-quality sources known to publish fake news, and AI-generated reports, of which half were reliable and half contained inaccuracies.

Each report was then paired with original research abstracts related to the technical topic, enabling the team to check each report for scientific accuracy. Their work is the first attempt at systematically directing LLMs to detect inaccuracies in science reporting in public media, according to Subbalakshmi.

“Creating this dataset is an important contribution in its own right, since most existing datasets typically do not include information that can be used to test systems developed to detect inaccuracies ‘in the wild,’” Dr. Subbalakshmi said. “These are difficult topics to investigate, so we hope this will be a useful resource for other researchers.”
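For illustration, a single record in a dataset like this would pair a report’s text with its labels and the research abstracts used as evidence. Below is a minimal sketch assuming JSON-style storage; the field names are illustrative, not the paper’s actual schema.

```python
# A minimal sketch of what one record in such a dataset could look like,
# assuming JSON-style storage; the field names are illustrative, not the
# paper's actual schema.
record = {
    "report_id": "news_0001",
    "report_text": "Full text of a news article about a scientific finding ...",
    "source": "human",            # "human" or "llm_generated"
    "label": "reliable",          # "reliable" or "misleading"
    "paired_abstracts": [         # original research abstracts (e.g., from CORD-19)
        "Abstract of the peer-reviewed study the report is based on ...",
    ],
}
```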

Next, the team created three LLM-based architectures to guide an LLM through the process of determining a news report’s accuracy. One of these architectures follows a three-step process. First, the AI model summarized each news report and identified its salient features.

Next, it conducted sentence-level comparisons between claims made in the summary and evidence contained in the original peer-reviewed research. Finally, the LLM made a determination as to whether the report accurately reflected the original research.
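As a rough sketch, that three-step flow could be wired up as below against any chat-style LLM. The `call_llm` helper and the prompt wording are assumptions for illustration, not the team’s actual code or prompts.

```python
# A minimal sketch of the three-step checking flow described above, written
# against a generic chat-style LLM; call_llm and the prompt wording are
# assumptions for illustration, not the team's actual code or prompts.

def call_llm(prompt: str) -> str:
    """Placeholder for any chat-completion call (open-source or commercial LLM)."""
    raise NotImplementedError("wire this up to the model of your choice")

def check_report(report_text: str, abstracts: list[str]) -> str:
    # Step 1: summarize the news report and pull out its salient claims.
    summary = call_llm(
        "Summarize this news report and list its key scientific claims:\n\n" + report_text
    )

    # Step 2: compare the claims, sentence by sentence, against the paired abstracts.
    evidence = "\n\n".join(abstracts)
    comparison = call_llm(
        "For each claim below, state whether it is supported, contradicted, or "
        f"not addressed by the evidence.\n\nClaims:\n{summary}\n\nEvidence:\n{evidence}"
    )

    # Step 3: ask for an overall verdict on whether the report reflects the research.
    return call_llm(
        "Based on this claim-by-claim comparison, does the news report accurately "
        f"reflect the original research? Answer 'reliable' or 'misleading'.\n\n{comparison}"
    )
```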

The team also defined five “dimensions of validity” and asked the LLM to reason about each of them: specific kinds of mistakes, such as oversimplification or confusing correlation with causation, that are commonly present in inaccurate news reports.

“We found that asking the LLM to use these dimensions of validity made quite a big difference to the overall accuracy,” Dr. Subbalakshmi said, adding that the dimensions can be expanded to better capture domain-specific inaccuracies if needed.
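One plausible way to fold such dimensions into the final judgment step is simply to enumerate them in the prompt. In the sketch below, only the first two dimensions are named in the article; the rest of the list and the prompt wording are placeholders, not the paper’s actual prompts.

```python
# A sketch of how "dimensions of validity" could be folded into the verdict
# prompt. Only the first two dimensions are named in the article; the rest of
# the list and the prompt wording are placeholders, not the paper's prompts.
DIMENSIONS_OF_VALIDITY = [
    "oversimplification of the findings",
    "confusing correlation with causation",
    # ... the remaining dimensions defined in the paper would be listed here ...
]

def verdict_prompt(comparison: str) -> str:
    dims = "\n".join(f"- {d}" for d in DIMENSIONS_OF_VALIDITY)
    return (
        "Check the report against each dimension of validity below, then decide "
        "whether it is reliable or misleading.\n\n"
        f"Dimensions of validity:\n{dims}\n\n"
        f"Claim-by-claim comparison:\n{comparison}"
    )
```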

Using the new dataset, the team’s LLM pipelines were able to correctly distinguish between reliable and unreliable news reports with about 75% accuracy—but proved markedly better at identifying inaccuracies in human-generated content than in AI-generated reports. The reasons for that aren’t yet clear, although Dr. Subbalakshmi notes that non-expert humans similarly struggle to identify technical errors in AI-generated text.

“There’s certainly room for improvement in our architecture,” Dr. Subbalakshmi says. “The next step might be to create custom AI models for specific research topics, so they can ‘think’ more like human scientists.”

In the long run, the team’s research could open the door to browser plugins that automatically flag inaccurate content as people use the Internet, or to rankings of publishers based on how accurately they cover scientific discoveries.

Perhaps most importantly, Dr. Subbalakshmi says, the research could also enable the creation of LLM models that describe scientific information more accurately, and that are less prone to confabulating when describing scientific research.

“Artificial intelligence is here—we can’t put the genie back in the bottle,” Dr. Subbalakshmi said. “But by studying how AI ‘thinks’ about science, we can start to build more reliable tools—and perhaps help humans to spot unscientific claims more easily, too.”

More information:
Yupeng Cao et al., CoSMis: A Hybrid Human-LLM COVID Related Scientific Misinformation Dataset and LLM pipelines for Detecting Scientific Misinformation in the Wild. openreview.net/pdf/17a3c9632a6 … f245c9dce44cf559.pdf

Provided by
Stevens Institute of Technology


Citation:
Team teaches AI models to spot misleading scientific reporting (2025, May 29)
retrieved 29 May 2025
from

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no
part may be reproduced without the written permission. The content is provided for information purposes only.



