

AI in Science Publication: The Good, the Bad and the Questionable

Image: "AI et al." Credit: Technology Networks.

AI in scientific publishing – new opportunities and new dilemmas

When AI-assisted technologies became increasingly mainstream early last year, they were met with both excitement over their potential to reduce the burden of mundane tasks and concern over their possible misuse – particularly in academic settings, and especially in academic publishing.

The global sharing of scientific research progress is integral to the existence of science and its ability to improve our lives and our planet. Through the current publishing model, the journey from research data generation to manuscript publication presents many opportunities where AI could, hypothetically, be used – for better or for worse.

Dmytro Shevchenko, lead data scientist at Aimprosoft and a PhD student in computer science at Kharkiv National University in Ukraine, has five years’ experience working with commercial large language models (LLMs). He believes there are many beneficial applications of generative AI (GAI) in publishing: “Creating abstracts and summaries is one example. LLMs can generate summaries of research papers, which can help readers quickly understand the main findings and implications of the research.”

Shevchenko also sees LLMs as having a positive impact on the accessibility and reach of research findings, given that they could facilitate the translation of research articles into different languages. “Text checking and correction is another benefit. LLMs are trained on large datasets and can generate coherent and grammatically correct text. This can help improve the overall quality of research papers by making articles more readable and understandable,” he adds.

“I think AI is a fantastic tool to streamline and speed up the publishing process,” echoes Dr. Andrew (Andy) Stapleton, a former research chemist. Stapleton now works as a content creator developing resources, training and products that are helpful for academics. “So much of the boring and procedural can be written faster (abstracts, literature reviews, summaries and keywords etc.).”

In early 2023, it seemed as though many scientific publishers did not share Shevchenko and Stapleton’s enthusiasm for the practical applications of AI. Some limited how the tools could be adopted during manuscript preparation, while others, like Science, took an even more restrictive stance, banning their use entirely.

“The scientific record is ultimately one of the human endeavor of struggling with important questions. Machines play an important role, but as tools for the people posing the hypotheses, designing the experiments and making sense of the results. Ultimately the product must come from – and be expressed by – the wonderful computer in our heads,” Herbert Holden Thorp, the Science journals editor-in-chief, said in January 2023.

In Stapleton’s opinion, this decision was underpinned by a fear of change. He thinks that the technology moved faster than journals were able to assess best practices. Perhaps this was a motivating factor for the outright ban – but there are very real hazards posed by the use of AI in scientific research and publication.  

AI policies in scientific publishing

The possibility that AI tools could “supercharge” paper mill systems is explored in Gianluca Grimaldi and Bruno Ehrler’s AI et al.: Machines Are About To Change Scientific Publishing Forever. Paper mill systems, where organizations produce and sell poor or fake journal papers, are just one of the unfortunate consequences of the publish or perish paradigm.

“A text-generation system combining speed of implementation with eloquent and structured language could enable a leap forward for the serialized production of scientific-looking papers devoid of scientific content, increasing the throughput of paper factories and making detection of fake research more time-consuming,” Grimaldi and Ehrler say. Paper mills are already abundant across the globe, and the authors fear that the situation will only worsen with the influx of AI-assisted tools.

Another concern is that, while LLMs can generate text, they can’t always produce accurate or scientifically valid content. “In most cases, LLMs may lack a proper understanding of scientific concepts and context. While they may generate text based on statistical patterns in the training data, they do not understand the meaning of the words or scientific concepts about which they are generating a particular text,” Shevchenko explains, emphasizing that researchers must carefully review and verify the text generated by LLMs to ensure its accuracy and validity.

Fabrication of data is also a potential problem. A recent Nature study used ChatGPT-3.5 and ChatGPT-4 to create short literature reviews on 42 different topics using 84 papers. The researchers found that 18% of references generated using ChatGPT-4, and 55% of those generated using ChatGPT-3.5, were fabricated. Discussing the work in a BMJ editorial piece, Drs. Nazrul Islam, associate professor of epidemiology and medical statistics, and Mihaela van der Schaar, the John Humphrey Plummer Professor of machine learning, artificial intelligence and medicine at the University of Cambridge, said, “This issue is particularly pressing with the proliferation in conspiracy theories, misinformation, disinformation and skepticism towards scientific consensus, such as the antivax movement.”

“There are also ethical issues associated with using LLMs in scientific publications, including issues of plagiarism, attribution and intellectual property rights,” Shevchenko says. “Researchers must ensure that appropriate attribution is given to sources and collaborators and that ethical principles of scholarly publishing are adhered to.”

“Over-reliance on AI could reduce the role of critical human judgment in scientific inquiry, potentially overlooking novel or unconventional insights that AI algorithms might miss,” Stapleton says.

Addressing such drawbacks while AI tools develop at a rapid pace is no easy feat. In Research Ethics, Hosseini et al. suggest that the most reasonable response is to “develop policies that promote transparency, accountability, fair allocation of credit and integrity.”

Are these policies working?

Ultimately, Science changed its position on AI-assisted technologies late last year. “They quickly did a 180 when they realized that they didn't have the manpower or time to be able to police that properly. So, they – very rightly – changed that, and authors can now put a small declaration of how AI was used,” Stapleton says.

Several other leading journals adopted a similar approach: most require a declaration when AI has been used, and firmly oppose its application to generate or alter research images.

JAMA requests that the details provided are granular – researchers must cite the name of any AI software platform, program or tool used, the version and extension number, manufacturer and date(s) of its use.

Springer Nature has specific policies for peer reviewers, requesting that, until peer reviewers are provided with safe AI tools, they “do not upload manuscripts into generative AI tools.”

Elsevier’s policies state that, where AI and AI-assisted technologies are used in the writing process, they should be used only to improve the readability and language of the work. This policy came under scrutiny recently when a peer-reviewed paper featuring an all-too-familiar introductory sentence – “Certainly, here is a possible introduction for your topic” – was published in one of its journals. The paper went viral on X, leading Elsevier to respond: “Our policies are clear that LLMs can be used in the drafting of papers as long as it is declared by the authors on submission. We are investigating this paper and are in discussion with [the] editorial team and the authors.”

This was not an isolated incident; the X thread features members of the scientific community flagging further examples of telltale AI-generated text in peer-reviewed research from different journals, which raises the question: are these guidelines clear, are they being enforced and are they working?

A robust framework for GAI in scientific publishing

Ganjavi et al. explored the extent and content of the top 100 academic publishers’ and scientific journals’ guidance for authors on the use of GAI. The research, updated in October of last year, found that only 24% of publishers provided guidance, and only 15% of these were among the top 25 publishers analyzed. Ultimately, the authors concluded that the guidelines of some of the top publishers are “lacking”.

“Among those that provided guidelines, the allowable uses of GAI and how it should be disclosed varied substantially, with this heterogeneity persisting in some instances among affiliated publishers and journals,” they write.

Heterogeneity causes confusion surrounding the “dos and don’ts” and could lead to discrepancies in the standard of work published in different journals. So, what’s the solution? Islam and van der Schaar suggest that a multifaceted approach, as paraphrased below, is required:

  1. Comprehensive guidelines that outline the acceptable use of GAI in research need to be developed and implemented.
  2. Medical journals must implement peer review processes tailored to identify and scrutinize AI-generated content, to safeguard against inaccuracies and ethical lapses.
  3. Clinical scientists, editorial boards, AI developers and researchers must collaborate to understand both the capabilities and limitations of the tools.
  4. A robust framework for transparency and accountability is needed, so that researchers using AI know how to disclose exactly how it was used in their work.
  5. Ongoing research must explore the impacts of AI on scientific integrity.

Progress is on the horizon, at least when it comes to the development of robust frameworks. Last year, Giovanni Cacciamani, associate professor of urology and radiology research at Keck School of Medicine, and colleagues established the "ChatGPT and Generative Artificial Intelligence Natural Large Language Models for Accountable Reporting and Use" (CANGARU) project, the protocol for which is currently available as a preprint on arXiv.*

The CANGARU project will help guide scientists on how they can ethically use and disclose the use of GAI in their work. It consists of four parts:

  • An ongoing systematic review of GAI/GPT/LLM applications to understand the linked ideas, findings and reporting standards in scholarly research, and to formulate guidelines for its use and disclosure.
  • A bibliometric analysis of existing author guidelines in journals that mention GAI/GPT/LLM, with the goal of evaluating existing guidelines, analyzing the disparity in their recommendations and identifying common rules that can be brought into the Delphi consensus process.
  • A Delphi survey to establish agreement on the items for the guidelines, ensuring principled GAI/GPT/LLM use, disclosure and reporting in academia.
  • The subsequent development and dissemination of the finalized guidelines and their supplementary explanation and elaboration documents.

“This global, cross-disciplinary project seeks to develop a universally inclusive set of consensus guidelines for the academic use of generative AI, leveraging the Delphi Consensus model. It has garnered the participation of over 3,000 academics from diverse fields worldwide, making it, to our knowledge, one of the largest and most inclusive Delphi consensus efforts in academia,” Cacciamani says. “We are eager to share the results of the CANGARU initiative and appreciate the support from academics and scientists around the world in this effort to protect the future of academia by ensuring the ethical and proper use of GAI going forward.”

A “broken” system?

Robust guidelines are the first step to navigating what will be a long, winding road of adaptation to AI-assisted technologies. Though challenging, addressing these issues might encourage confrontation of some of the deeper-rooted issues in academic publishing, Stapleton says: “In my opinion, AI has just highlighted the issues with an already broken and gamified academic system.”

“It is the breaking point where we now need to address the underlying issues of the publish or perish culture that has eaten away the foundations of academia. It’s easy to blame AI as the cause of the issues, when in fact it is the magnifier that shows just how bad things have gotten,” he concludes.

About the interviewees

Andrew (Andy) Stapleton
Andy Stapleton is an educator, entrepreneur and former research chemist. After completing his PhD in chemistry at the University of Newcastle in Australia in 2011, and spending many years in both academia and industry, Andy shifted gears to focus on science communication and academic advising. You can find Andy on his YouTube channel “Andy Stapleton”, where he discusses everything about graduate school and offers tips, advice and insights into life in academia and the tools for navigating it.

Dmytro Shevchenko

Dmytro Shevchenko is a PhD student in computer science and a data scientist at Aimprosoft. He has five years of commercial experience with LLMs.

*This article is based on research findings that are yet to be peer-reviewed. Results are therefore regarded as preliminary and should be interpreted as such. Find out about the role of the peer review process in research here. For further information, please contact the cited source.