Res Integr Peer Rev. 2025 Apr 7;10(1):4
BACKGROUND: While some recent studies have examined large language model (LLM) use in peer review at the corpus level, there have been few examinations to date of AI-generated reviews in their social context. The goal of this first-person account is to present my experience of receiving two anonymous peer review reports that I believe were produced using generative AI, as well as the lessons learned from that experience.
METHODS: This is a case report covering the timeline of the incident and the actions subsequently taken by me and by the journal. Supporting evidence includes text patterns in the reports, online AI detection tools, and ChatGPT simulations; recommendations are offered for others who may find themselves in a similar situation. The primary limitation of this article is that it is based on one individual's personal experience.
RESULTS: After I alleged the use of generative AI in December 2023, two months of back-and-forth with the journal ensued, culminating in my withdrawal of the submission. The journal denied any ethical breach without taking an explicit position on the allegations of LLM use. Based on this experience, I recommend that authors engage in dialogue with journals on AI use in peer review prior to article submission; where undisclosed AI use is suspected, authors should proactively amass evidence, request an investigation protocol, escalate the matter as needed, involve independent bodies where possible, and share their experience with fellow researchers.
CONCLUSIONS: Journals need to promptly adopt transparent policies on LLM use in peer review, in particular requiring disclosure. Open peer review, in which the identities of all stakeholders are declared, might safeguard against LLM misuse, but accountability in the AI era is needed from all parties.
Keywords: Academic misconduct; ChatGPT; Generative AI; LLMs; Large language models; Peer review