Ethical Use of ChatGPT in Scientific Writing: A Call for Transparent Guidelines and Uniform Editorial Policies
Author(s):
Madeena Sultana
Defence Research and Development Canada
Defence Scientist
Disclaimer: The French version of this editorial has been auto-translated and has not been approved by the author.
ChatGPT, powered by advanced natural language processing (NLP) algorithms, has emerged as a powerful tool for various applications, including scientific writing. It is an Artificial Intelligence (AI)-based generative language model that can engage in conversational exchanges, generating coherent and contextually relevant responses. Within just two months of its launch, ChatGPT acquired over 100 million users worldwide. Following the trend, scientists and researchers have begun using ChatGPT for scientific writing. Unlike other tools, however, ChatGPT's contribution to scientific writing can be as significant as that of a co-author. In this period of uncertainty, ChatGPT has even been credited as a co-author of accepted articles. This raises questions such as whether ChatGPT, as an AI tool, can be held accountable, like human authors, for everything presented in a scientific paper. If not, how can researchers and scientists leverage such a powerful tool to boost their productivity within ethical boundaries? This editorial explores the importance of establishing standard guidelines, use cases, and policies for utilizing generative AI models like ChatGPT in scientific writing to ensure their responsible and effective use.
What Is So Exciting About ChatGPT and AI Models for Scientific Writing?
ChatGPT presents several advantages that can significantly benefit scientific writing. It can offer a substantial productivity boost by automating certain aspects of the writing process, such as generating initial drafts, summarizing research findings, or suggesting relevant literature. This time-saving potential can help researchers focus on higher-level tasks and accelerate the dissemination of scientific knowledge. ChatGPT can also enhance equity by removing language barriers for non-native speakers, offering support for editing, grammatical correction, and coherent writing. Moreover, this AI tool allows researchers to tailor their writing for different audiences, ranging from scientific experts to policymakers or the general public. It can assist in effectively communicating complex concepts, making scientific knowledge accessible to broader audiences. By bridging the gap between scientific research and society, ChatGPT can promote informed decision-making and public engagement with science. Some commercial tools based on language models like ChatGPT are emerging to assist researchers in conducting literature reviews, streamlining this tedious task. On top of that, AI tools like ChatGPT can generate in seconds plots that might take a researcher several hours to create with existing tools. Recently, researchers who published their work in Nature Human Behaviour have shown that AI models can accelerate scientific discoveries and compensate for humans' blind spots by generating hypotheses that humans would be unlikely to conceive until the distant future. As AI models continue to advance, it is expected that AI-powered scientific processes, spanning from forming hypotheses to generating plots and writing, will become an undeniable reality in the near future.
What Are the Risks?
While the potential benefits of using ChatGPT in scientific writing are substantial, it is essential to acknowledge the associated risks. One such risk is hallucination, where ChatGPT may generate inaccurate or fictional information. These hallucinations can lead to the inclusion of misleading or false statements in scientific articles, undermining the integrity of research and knowledge. Another critical risk is the challenge of alignment, where the AI system might not accurately capture the author's intent or faithfully reflect the scientific evidence. This misalignment can result in distorted interpretations, leading to unintended biases, inaccuracies, or misrepresentations within the generated text. Furthermore, because ChatGPT is trained on large datasets that may include copyrighted or proprietary information, there is a risk of inadvertently including plagiarized content in the generated text. This can compromise the originality and integrity of scientific writing, potentially leading to legal and ethical issues. Additionally, the leakage and storage of submitted data on third-party servers may pose a significant risk to research security, where scientific data must be kept out of the reach of bad actors. The fairness and biases of ChatGPT also demand attention: if the AI model is not adequately fine-tuned or trained on diverse datasets, it may inadvertently perpetuate biases in the generated text, reinforcing existing inequalities or prejudices. Other significant concerns, raised in a recent Nature article, include the potential for writing fake scientific articles with fictitious experimental results, as well as p-hacking, where scientists choose to publish only favorable hypothesis test results.
Which Path to Take: Banning, Restricting, or Adapting?
The risks associated with using ChatGPT in scientific writing can have profound implications for scientific accountability, credibility, and integrity. To address these concerns, it is crucial to develop explicit guidelines and standard editorial policies for utilizing ChatGPT-like tools in scientific writing. So far, the scientific community has been swinging between banning, restricting, and adopting the use of AI-generated text, data, and images in research papers. For example, the editorial policy of the Science family of journals strictly bans the use of AI-generated text, figures, images, and graphics without explicit permission from the editors. JAMA discourages the submission of content "reproduced and re-created" by AI, language models, machine learning, or similar technologies unless it is part of the formal research design and methods. Where such use is permitted, JAMA requires a clear description of the created content and the name, version number, and extension of the model or tool, along with its manufacturer. Nature, on the other hand, has taken a somewhat more adaptive approach: its guidelines state that the use of language models like ChatGPT should be properly documented in the Methods section or, if a Methods section is unavailable, in another suitable part of the paper. One clear and unanimous guideline among all the big publishers is not to credit ChatGPT as an author of a research paper. The whole research community would agree on this, as ChatGPT and similar systems are not yet capable of taking on the accountability and responsibilities of human authors. The editors of Accountability in Research recently published an editorial proposing a new policy on the use of AI-generated text in research papers, aiming to ensure transparency and accountability. The draft policy proposal takes a much more practical standpoint, both acknowledging the use of AI tools at different stages of research development (literature review, synthesizing ideas, text-content generation, etc.) and prescribing disclosure in a more structured manner, with examples. The editors suggested submitting supplementary material containing the content generated by the NLP system, along with appropriate disclosure in the methodology section. This is a commendable initiative, paving the way for the ethical use of AI tools in scientific writing.
AI Is Here to Stay, and We Need to Adapt
The scientific community is increasingly leaning towards adapting AI tools in scientific writing rather than banning them outright. However, the guidelines for the proper disclosure of AI-generated content in scientific writing remain insufficient for effective implementation. From an author's perspective, the following elements should be considered when developing clear and consistent guidelines for using AI tools in scientific writing:
- Uniform Terminologies: Different publishers have used terms like "AI-generated" and "re-generated" interchangeably. Tools like ChatGPT can both generate content from scratch and regenerate content by editing according to user instructions. It is essential to distinguish between AI "re-generated" and AI "generated" content, using consistent terminology for both.
- Documentation and Disclosure Guidelines: Any use of models like ChatGPT in scientific writing needs to be disclosed. The current guidelines, however, do not adequately address how the AI's contributions should be documented, in which format, and to what extent. The following should be considered to enhance the structure and clarity of the disclosure process:
- Documentation Use Cases: It is essential to provide use cases that clarify the required documentation. Different levels of AI usage, such as significant content generation versus minor content editing, should have varying documentation requirements.
- Documentation Template: Using ChatGPT is still an art, involving prompt engineering and iterative processes. Free-form documentation of such a process is not only challenging but might also be counterproductive. A structured documentation template, such as the sketch following this list, could reduce ambiguity, ensuring consistency and transparency.
- Disclosure Statement Appearance: Different publishers have varying requirements for AI contribution declarations. For example, JAMA requires a declaration in the acknowledgment or methods section (as appropriate); Nature, in the Methods or an alternate section; IEEE, in the acknowledgment section; Accountability in Research, in the methodology section, with supplementary material submission; and Elsevier, in a separate section above the references. A consensus should be reached about where these contributions are disclosed to provide authors with consistent guidelines.
- Reviewing Policy: Clear reviewing guidelines for papers that disclose the use of tools like ChatGPT need to be established. For instance, how should reviewers assess clarity and presentation if a paper has been edited by ChatGPT and appropriately acknowledged?
- Guidelines for Non-textual Content: There should be standard guidelines for the ethical generation of non-textual content such as images, graphics, computer code, and data.
- Fair Detection Mechanism: The guidelines should specify measures to detect undisclosed AI-generated content. A recent study by Stanford researchers highlighted biases in AI detectors against non-native English speakers. It is crucial to ensure that non-native speakers of any language are not penalized by inaccurate detection.
- Dispute Settlement Mechanism: Given the absence of robust detection and watermarking techniques, as well as the capacity of tools like ChatGPT to mimic an author’s style, disputes might arise between authors and publishers regarding fair detection and transparency. Thus, it is essential to establish an effective dispute resolution system to address fairness, transparency, and potential biases tied to AI-generated content.
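To make the template idea above concrete, the following is a minimal illustrative sketch of what such a disclosure template might ask authors to record; the fields and wording are hypothetical examples, not drawn from any publisher's current policy:
- Tool, version, and manufacturer: e.g., ChatGPT (GPT-4, May 2023 version) by OpenAI.
- Stage(s) of use: literature review, idea synthesis, drafting, or editing.
- Scope of contribution: the sections, figures, or analyses affected.
- Prompts and outputs: the key prompts and generated text, supplied as supplementary material.
- Human verification: a statement that the authors reviewed all AI-assisted content and take full responsibility for it.
A template of this kind would take authors only minutes to complete while giving editors, reviewers, and readers a consistent picture of how an AI tool was used.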
In conclusion, while ChatGPT offers numerous benefits for scientific writing, it is crucial to establish standard guidelines, use cases, and policies to mitigate the associated risks effectively. Such efforts will promote responsible and ethical use of AI in scientific writing and ensure that ChatGPT becomes a valuable tool that enhances scientific progress while upholding the highest standards of integrity and quality.