In new research published in the Harvard Kennedy School Misinformation Review, researchers from the University of Borås, Lund University, and the Swedish University of Agricultural Sciences identified a total of 139 papers suspected of undisclosed use of ChatGPT or similar large language model (LLM) applications. Of these, 19 appeared in indexed journals, 89 in non-indexed journals, 19 as student papers in university databases, and 12 as working papers (mostly in preprint databases). Health and environment papers accounted for approximately 34% of the sample, and 66% of those were published in non-indexed journals.
Using ChatGPT to generate text for academic papers has raised concerns about research integrity.
Discussion about this phenomenon is ongoing in editorials, commentaries, opinion pieces, and social media.
Several lists of papers suspected of undisclosed GPT use already exist, and new papers are being added all the time.
Although there are many legitimate uses of GPT for research and academic writing, its undeclared uses beyond proofreading may have far-reaching implications for both science and society, especially the relationship between the two.
“One of the main concerns about AI-generated research is the increased risk of evidence hacking, meaning that fake research could be used for strategic manipulation,” said Björn Ekström, a researcher at the University of Borås.
“This could have a tangible impact, as erroneous results could penetrate further into society and into more areas.”
In their study, Dr. Ekström and his colleagues searched and scraped Google Scholar for papers containing specific phrases known to be common responses from ChatGPT and applications built on the same underlying model, such as “I don’t have access to real-time data.”
This made it possible to identify papers whose text may have been generated with generative AI; the search retrieved 227 papers.
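The detection approach can be illustrated with a short sketch. The following minimal Python example is not the authors’ actual pipeline: it queries Google Scholar’s public search page for one of the telltale phrases and parses result titles and snippets. The URL parameters and CSS selectors (gs_ri, gs_rt, gs_rs) reflect Scholar’s current HTML and are assumptions that may break at any time; Scholar has no official API and rate-limits automated requests.

```python
import time
import requests
from bs4 import BeautifulSoup

# Telltale phrases, quoted so Scholar treats them as exact matches.
PHRASES = [
    '"I don\'t have access to real-time data"',
    '"as of my last knowledge update"',  # assumed example of a similar phrase
]

def search_scholar(phrase: str, start: int = 0) -> list[dict]:
    """Fetch one page of Google Scholar results for an exact phrase.

    Illustrative only: the parameters and selectors below mirror
    Scholar's public HTML, which is undocumented and may change.
    """
    resp = requests.get(
        "https://scholar.google.com/scholar",
        params={"q": phrase, "start": start},
        headers={"User-Agent": "Mozilla/5.0"},
        timeout=30,
    )
    resp.raise_for_status()
    soup = BeautifulSoup(resp.text, "html.parser")
    results = []
    for hit in soup.select(".gs_r .gs_ri"):  # one block per search result
        title = hit.select_one(".gs_rt")
        snippet = hit.select_one(".gs_rs")
        results.append({
            "title": title.get_text(" ", strip=True) if title else "",
            "snippet": snippet.get_text(" ", strip=True) if snippet else "",
        })
    return results

if __name__ == "__main__":
    for phrase in PHRASES:
        hits = search_scholar(phrase)
        print(f"{phrase}: {len(hits)} results on first page")
        time.sleep(10)  # be polite; Scholar blocks rapid automated requests
```

In practice, each hit flagged this way still needs manual review, since a paper may quote such a phrase legitimately, for example when writing about chatbots.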
Of these papers, 88 involved legitimate and/or declared use of GPT, while 139 involved undeclared and/or fraudulent use.
The majority of the problematic papers (57%) dealt with policy-relevant subjects (i.e., environment, health, computing) susceptible to influence operations.
Most were available in multiple copies on different domains (social media, archives, repositories, etc.).
Professor Jutta Haider from the University of Borås said: “If we cannot trust that the studies we read are genuine, we run the risk of making decisions based on misinformation.”
“But this is as much a media and information literacy issue as it is a scientific misconduct issue.”
“Google Scholar is not an academic database,” she pointed out.
“Search engines are easy to use and fast, but they lack quality assurance procedures.”
“This is already a problem with regular Google search results, but it becomes even more problematic when it comes to making science accessible.”
“People's ability to decide which journals and publishers publish quality-controlled, reviewed research is crucial to finding and determining what constitutes trustworthy research, and it is very important for decision-making and opinion formation.”
_____
Jutta Haider et al. 2024. GPT-fabricated scientific papers on Google Scholar: Key features, spread, and implications for preempting evidence manipulation. Harvard Kennedy School Misinformation Review 5(5); doi: 10.37016/mr-2020-156
Source: www.sci.news