Acad Radiol. 2025 Jul 17. pii: S1076-6332(25)00637-3. [Epub ahead of print]
RATIONALE AND OBJECTIVES: Large language models (LLMs) show promise in radiology across various clinical applications and in assisting with research manuscript development. Recent surveys indicate that 52.6% of medical researchers use LLMs in manuscript development, with non-medical researchers reporting similar rates. Given concerns about hallucinations, bias, and related risks, many publishers now require disclosure of LLM use. Although most medical imaging journals had LLM policies in place as of 2025, actual disclosure rates for LLM usage remain unknown. Our study examines disclosure trends by analyzing 1998 radiology publications for LLM disclosures.
MATERIALS AND METHODS: A bibliometric analysis of nine radiology journals with LLM use disclosure requirements was performed. The study included primary investigations and secondary research while excluding short-form publications. The LLM disclosure rate was calculated overall. Logistic regression assessed temporal trends in disclosure rates, while a linear mixed effects model evaluated the relationship between disclosure status and peer review duration. Chi-square analysis examined associations between manuscript type and disclosure rates.
RESULTS: Of 1998 manuscripts, 34 (1.7%) declared LLM use. Most disclosures involved ChatGPT (32, 94.1%), primarily for readability and grammar editing (33, 97.1%). The majority of manuscripts disclosing LLM use originated from institutions in non-English-speaking countries (22, 64.7%). No significant increase in disclosure rates over time was observed (OR: 1.06 [95% CI: 0.98, 1.16], p=0.15), and no relationship between disclosure status and peer review duration was found (coefficient: -4.85, SE=11.25, p=0.67). Secondary research manuscripts disclosed LLM use more frequently than primary investigations (3.9% vs. 1.3%, p<0.001), with a small effect size (Cramér's V: 0.08 [95% CI: 0.04, 1.00]).
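As an illustration of the effect-size calculation above, the following sketch computes the chi-square statistic and Cramér's V for a 2×2 contingency table. The counts below are hypothetical, chosen only to approximate the reported proportions (3.9% vs. 1.3% of 1998 manuscripts); the study's actual group sizes are not stated in the abstract.

```python
import math

def chi2_cramers_v_2x2(a, b, c, d):
    """Chi-square statistic and Cramér's V for the 2x2 table
    [[a, b], [c, d]], using the closed-form 2x2 chi-square formula."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    v = math.sqrt(chi2 / n)  # for a 2x2 table, min(rows-1, cols-1) = 1
    return chi2, v

# Hypothetical counts approximating the reported rates:
#   secondary research: 12 disclosed / 308 total  (~3.9%)
#   primary research:   22 disclosed / 1690 total (~1.3%)
chi2, v = chi2_cramers_v_2x2(12, 296, 22, 1668)
```

With these illustrative counts, the chi-square statistic is roughly 10.5 (exceeding the 3.84 critical value at p=0.05) and Cramér's V is roughly 0.07, i.e. a statistically significant but small association, consistent in magnitude with the reported value of 0.08.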
CONCLUSION: Our findings demonstrate remarkably low disclosure rates in radiology manuscripts despite surveys indicating significant LLM adoption among researchers. This discrepancy may result from true non-use, fear of stigma, perceived advantages of undisclosed use, disagreement with disclosure requirements for minor editing, or policy unawareness, among other reasons. These findings suggest a need for more accepting research environments that recognize legitimate LLM benefits while developing nuanced disclosure policies addressing risks.
Keywords: Generative artificial intelligence; Large language models; Research disclosure