Int J Nurs Stud. 2025 Dec 19. pii: S0020-7489(25)00332-3. [Epub ahead of print] 176: 105322
BACKGROUND: Clinical documentation is essential for safe, high-quality care but has become increasingly complex, contributing to clinician burnout. Large language models offer potential to ease documentation by generating summaries, structuring data, and ensuring compliance. However, concerns remain regarding accuracy, bias, privacy, and regulatory risks.
OBJECTIVE: To map current literature on large language model applications in clinical documentation and to evaluate their benefits, limitations, and ethical considerations.
INFORMATION SOURCES: Five electronic databases (i.e., PubMed, Scopus, CINAHL, Cochrane Library, and IEEE Xplore) covering peer-reviewed literature published in English between January 2009 and August 2025.
METHODS: This scoping review followed Arksey and O'Malley's framework and was reported in accordance with PRISMA-ScR guidelines. Screening, data extraction, and quality appraisal were conducted independently by multiple reviewers using Joanna Briggs Institute tools. Findings were synthesized using descriptive and narrative approaches.
RESULTS: Forty-one studies met inclusion criteria, most originating from the United States. Large language models were primarily applied to clinical note generation, discharge summaries, and provider-patient encounter documentation. Key evaluation metrics included content accuracy, linguistic quality, and summarization performance. Large language models demonstrated potential to improve documentation efficiency and readability, with some studies reporting up to 40% time savings. However, concerns about factual inaccuracies, hallucinations, and reduced performance in complex cases were common. Clinician perceptions were mixed. Some found notes generated by large language models helpful and well-structured, while others raised concerns about reliability, liability, and loss of clinical nuance. Ethical challenges included data privacy, security, and algorithmic bias, with varying levels of compliance across settings.
CONCLUSIONS: Large language models hold significant promise for enhancing clinical documentation by improving efficiency, standardization, and clarity. However, their safe and effective use requires rigorous attention to accuracy, ethical safeguards, and clinician trust. Integration must support, rather than supplant, clinical reasoning and patient-centered care. Co-design with clinicians, real-world evaluation, and artificial intelligence literacy are essential to ensure that these technologies augment, not erode, professional judgment and care quality.
REGISTRATION: Open Science Framework Registries (https://osf.io/m4h3q).
Keywords: Artificial intelligence; Documentation; Electronic health records; Large language models