J Clin Epidemiol. 2025 Jul 18. pii: S0895-4356(25)00236-7. [Epub ahead of print] 111903
ADVANCED working group
OBJECTIVES: This study aimed to systematically map the development methods, scope, and limitations of existing artificial intelligence (AI) reporting guidelines in medicine and to explore their applicability to generative AI (GAI) tools, such as large language models (LLMs).
STUDY DESIGN AND SETTING: We conducted a scoping review and reported it in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews (PRISMA-ScR). Five information sources were searched from inception to December 31, 2024: MEDLINE (via PubMed), EQUATOR Network, CNKI, FAIRsharing, and Google Scholar. Two reviewers independently screened records and extracted data using a predefined Excel template. Extracted data included guideline characteristics (e.g., development methods, target audience, AI domain), adherence to EQUATOR Network recommendations, and consensus methodologies. Discrepancies were resolved by a third reviewer.
RESULTS: A total of 68 AI reporting guidelines were included. Of these, 48.5% focused on general AI, while only 7.4% addressed GAI/LLMs. Methodological rigor was limited: 39.7% described development processes, 42.6% involved multidisciplinary experts, and 33.8% followed EQUATOR recommendations. Significant overlap existed, particularly in medical imaging (20.6% of guidelines). GAI-specific guidelines (14.7%) lacked comprehensive coverage and methodological transparency.
CONCLUSION: Existing AI reporting guidelines in medicine show suboptimal methodological rigor, considerable redundancy, and insufficient coverage of GAI applications. New and updated guidelines should prioritize standardized development processes, multidisciplinary collaboration, and an expanded focus on emerging AI technologies such as LLMs.
Keywords: Artificial intelligence; Generative artificial intelligence; Large language models; Reporting guidelines; Scoping review