bims-librar 2026-03-01 papers

bims-librar

Biomed News

on Biomedical librarianship

Issue of 2026–03–01
48 papers selected by
Thomas Krichel, Open Library Society

Why every scientist needs a librarian.
Informing a medical subject heading (MeSH) request: The 'Prescribing cascades' case study.
Academic medical librarians and video consultations: our new normal.
Take a look at our CPD library.
Searching disease-related genes with Mapping of Biological Entities from Literature (MaBEL).
Development and Validation of a Search Filter to Identify Research on Bisexuality.
Knowledge-based Citation Reasoning for Biomedical Domain.
Advertising of orthodontic appliances on websites in the UK: Do they comply with advertising standards? A cross-sectional study.
PreprintToPaper dataset: connecting bioRxiv preprints with journal publications.
A Dataset for Addressing Patient's Information Needs related to Clinical Course of Hospitalization.
AntibioticDB: An Updated and Improved Open-Access Database for the Antibacterial Research and Development Community.
A Blinded Analysis of Quality and Fidelity in Orthopaedic Patient Education Materials Simplified by ChatGPT and Humans.
Do Large Language Models Perform Equally Across Languages? A Comparison of Responses to Frequently Asked Questions in Anesthesiology.
Performance of ChatGPT-4o, Gemini 2.0 Pro, and DeepSeek-V3 in Patient-Facing Information on Chest Wall Deformities: A Comparative Evaluation of Accuracy, RELIABILITY, and Reproducibility.
Artificial intelligence vs. human evaluation of anesthesia education videos: a comparative analysis using validated quality scales.
Evaluating Artificial Intelligence-Generated Responses to Patient Questions Regarding Orthobiologic Injections.
Evaluation of ChatGPT and Gemini in Answering Patient Questions after Gynecologic Surgery.
Evaluating large language model responses to patient questions on ulnar collateral ligament repair.
Readability assessment of ChatGPT 5.0 responses are more complex for Anterior Cruciate Ligament Reconstruction compared to American Academy of Orthopedic Surgeons' OrthoInfo.
Human and Artificial Intelligence (AI) Analysis of Patient Experiences of Periodontal Graft Surgery.
Assessing Patient Education Guide Generated by ChatGPT vs Google Gemini on Common Hepatology Conditions: A Cross-sectional Study.
#infertility education on Instagram: A Reliability Analysis in Brazil Using the DISCERN Instrument.
Quality evaluation of online platforms information for retail of prescription medicines in China: an observational study of the case of paroxetine.
Readability of Online Patient Education Materials for Peripartum Cardiomyopathy.
A comprehensive analysis of online patient educational materials for lichen sclerosus.
Online Postoperative Instructions after Nonsurgical Root Canal Treatment: Readability, Usability, and Transparency.
Quality and Literacy Assessment of ASCRS Patient Education Material for Spanish and Chinese Speakers: A Call for Improvement.
Evaluating the Readability, Accessibility, and Quality of Online Health Information on Opioid Use Disorder: A Comparative Analysis of Government and Non-Government Websites.
The Dental Misinformation Epidemic: An Evaluation of Quality, Accuracy and Educational Impact in Youtube's Tooth Bleaching Content.
YouTube as a digital health platform: A cross-sectional thematic and sentiment analysis of the 100 most-viewed knee osteoarthritis videos.
Sarcopenia and Nutrition on YouTube: A Content Quality and Reliability Assessment.
Evaluation of Popularity, Reliability, and Quality of Dental and Oral Microbiology Videos on YouTube as Sources of Dental Education.
Evaluation of the usability of YouTube videos as educational materials in improving patients' metered dose inhaler use skills: A cross-sectional study.
Assessing the content and quality of gout-related videos on video platforms: A cross-sectional study.
Quality, reliability and popularity assessment of Turkish YouTube videos on human papillomavirus and its vaccine: content analysis.
Intoeing gait in children: one of the most common pediatric orthopedic complaints and an evaluation of the quality of related YouTube videos.
Evaluation of Social Media Short-Form Video Content for Patient Education on Vision-Threatening Diseases.
Exploring the Quality of TikTok-Based Diabetes Self-Management Education in English and Spanish: A Digital Health Study.
TikTok as a Platform for Patient Education and Health Information in Rare Genetic Diseases: Cross-Sectional Study.
Cracking the Code: Adolescent Insights on TikTok Health Information Videos Produced by University Health Students.
Assessment of the most popular short videos about hemorrhoids on Chinese TikTok: A cross-sectional study.
Assessment of the educational value of robotic‑assisted thoracoscopic pulmonary lobectomy videos.
Health information-seeking behavior among women with polycystic ovary syndrome: A scoping review protocol.
Helping people navigate and make sense of cancer information: challenges and perspectives.
When Patients Go to "Dr. Google" Before They Go to the Emergency Department.
Digital access and health outcomes: The moderating role of socioeconomic status in health information seeking.
From scrubs to screens: A hashtag analysis study of #Nursetok on TikTok.
Online health information seeking behaviors and self-reported health among Chinese university students: associations and sex differences.

Nature. 2026 Feb;650(8103): 1063-1065

Why every scientist needs a librarian.

Amber Dance.



Keywords:  Careers; Information technology; Research management; Scientific community

DOI:  https://doi.org/10.1038/d41586-026-00568-y
Res Social Adm Pharm. 2026 Feb 24. pii: S1551-7411(26)00032-X. [Epub ahead of print]

Informing a medical subject heading (MeSH) request: The 'Prescribing cascades' case study.

Fernando Fernandez-Llimos, Fernanda S Tonin.

   BACKGROUND: 'Prescribing cascades' describes initiation of new treatments in response to adverse drug effects, often misinterpreted as new conditions. Although Medical Subject Headings (MeSH) enable accurate indexing and retrieval, no MeSH term exists for this concept. This study assessed the need for a dedicated MeSH term and its potential impact on indexing consistency.
METHODS: A cross-sectional analysis was conducted using 139 articles from a published scoping review. PubMed metadata provided publication details and indexing status. MEDLINE-indexed articles were analyzed for MeSH attribution. Associations between the presence of keywords ("inapprop-", "adverse", "cascade") in titles/abstracts and specific MeSH terms were quantified using odds ratios (OR) with 95% confidence intervals (CI) (R/RStudio).
RESULTS: Of 139 articles included in the scoping review, 119 were indexed in PubMed and 106 in MEDLINE, with 1055 MeSH terms covering 313 unique terms. The most prevalent were "Drug-Related Side Effects and Adverse Reactions" (32.1%), "Polypharmacy" (30.2%), and "Inappropriate Prescribing" (26.4%). Indexing was manual in 47 cases, automated in 45, and curated in 14. After 2022, automated indexing reduced use of "Drug Interactions" and "Medication Errors." Only three associations demonstrated predictive values: the root "inapprop-" in abstracts predicted the MeSH "Inappropriate Prescribing" (OR = 5.1 [95% CI 1.9-13.7]); "inapprop-" in titles for "Potentially Inappropriate Medication List" (OR = 204 [95% CI 9.1-4551.3]); and "adverse" in abstracts for "Drug-Related Side Effects and Adverse Reactions" (OR = 2.8 [95% CI 1.2-6.7]). The word "cascade" in titles was associated with "Drug-Related Side Effects and Adverse Reactions" (OR = 3.6 [95% CI 1.4-9.2]).
CONCLUSION: Current evidence does not support a dedicated MeSH for 'prescribing cascades' as existing descriptors are sufficient to capture its main dimensions. Further research should prioritize definitional consensus and compositional indexing strategies to enhance retrieval accuracy.

Keywords:  Bibliographic databases; Controlled vocabulary; Information storage and retrieval; Terminology as topic

DOI:  https://doi.org/10.1016/j.sapharm.2026.02.009
Med Ref Serv Q. 2026 Feb 24. 1-16

Academic medical librarians and video consultations: our new normal.

Amelia Brunskill, Rosie Hanneke.

  Since the onset of the COVID-19 pandemic, interest in video research consultations has rapidly expanded. This study aimed to characterize the current state of academic health sciences librarians' practices and perspectives around this mode of consultation. We distributed a survey, which received 124 eligible responses. Our analysis found that video consultations were now generally in higher demand than in-person consultations and were viewed positively by respondents. Advantages mentioned included convenience & flexibility, technical capabilities, and health benefits. Disadvantages mentioned included technology challenges, issues with engagement, interruptions, and personal preference for in-person meetings.

Keywords:  Academic libraries; reference services; video consultations

DOI:  https://doi.org/10.1080/02763869.2026.2629237
Vet Rec. 2026 Feb/Jul 28;198(5):198(5): 241

Take a look at our CPD library.

DOI: https://doi.org/10.1002/vetr.70454
Comput Biol Chem. 2026 Feb 17. pii: S1476-9271(26)00083-6. [Epub ahead of print]123 108958

Searching disease-related genes with Mapping of Biological Entities from Literature (MaBEL).

Gratchela Dutra Rodrigues, Gabriel Liston de Menek, Darling de Andrade Lourenço, Frederico Schmitt Kremer.

  The exponential growth of biomedical literature presents a major challenge for systematically identifying disease-related genes and therapeutic targets. We introduce MaBEL (Mapping of Biological Entities from Literature), a scalable and adaptable text-mining platform that unifies literature retrieval, entity recognition, and data integration into a single framework. In contrast to single-source text-mining systems, MaBEL retrieves publications from PubMed, Scopus, ScienceDirect, SciELO, and major preprint servers (bioRxiv, medRxiv, arXiv, ChemRxiv), consolidating them through DOI-based deduplication to ensure comprehensive and nonredundant coverage. The platform employs modular natural language processing pipelines, combining SciSpaCy for gene and protein recognition, BioSyn for rapid alias normalization, and PubTator 3.0 for enriched semantic and relational annotation. Built on a distributed architecture using Flask, Celery, and Docker, MaBEL supports asynchronous, large-scale text processing with near real-time performance. Applied to seven major diseases, MaBEL processed over 14,000 unique articles, achieving accurate identification of high-frequency, disease-salient genes and strong concordance with Open Targets Platform association scores. This demonstrates its reliability for uncovering biologically meaningful disease-gene relationships. By integrating multi-source retrieval, scalable computation, and modular adaptability, MaBEL represents a novel, extensible framework that advances biomedical text mining beyond static, single-database approaches, facilitating rapid hypothesis generation and accelerating the discovery of molecular targets in translational research. The source codes can be accessed at https://github.com/omixlab/Mabel.

Keywords:  Automated literature retrieval; Drug discovery; Named Entity Recognition (NER)

DOI:  https://doi.org/10.1016/j.compbiolchem.2026.108958
Arch Sex Behav. 2026 Feb 25.

Development and Validation of a Search Filter to Identify Research on Bisexuality.

Jane Morgan-Daniel, Chelsea Misquith, Andy Hickner.

  Locating health research on bisexual and pansexual (bisexual+) populations is challenging, as the data are usually collated with data for additional sexual minority and gender identities. To improve the findability of health research on bisexual+ populations, this study sought to develop a sensitivity-maximizing PubMed search filter for bisexuality. Using the relative recall method for the development and validation of search filters, we used PubMed, CitationChaser, and Covidence to search for and screen studies. To be included, studies had to report bisexual/pansexual-specific data and be MEDLINE-indexed with a PubMed Identifier (PMID). Of 291 eligible records, 252 had PMIDs; these records constituted the gold standard set used to develop the search filters. Combinations of search terms were tested against the gold standard set. Sensitivity and number needed to read (NNR) were calculated for each combination. Two search filters are presented. The sensitivity-maximizing search retrieved 100% of the gold standard set, with an NNR of 129.85. The optimized search retrieved all but one of the gold standard articles (99.60%), with an NNR of 74.88. Two PubMed search filters are presented for bisexual+ populations. These filters were validated using the relative recall method against a gold standard set derived from citing and cited references of systematic reviews on bisexual+ health. Use of the sensitivity-maximizing search filter is recommended for exhaustive searches, while the optimized filter is considered more appropriate for nonexhaustive searches.

Keywords:  Bisexuality; Evidence syntheses; Pansexual; Search filters; Sexual orientation

DOI:  https://doi.org/10.1007/s10508-025-03354-5
Bioinformatics. 2026 Feb 24. pii: btag061. [Epub ahead of print]

Knowledge-based Citation Reasoning for Biomedical Domain.

Pengcheng Li, Kai Zhang, Xiaozhong Liu, Xuhong Zhang.

   MOTIVATION: Citation is central to scholarly communication, enabling researchers to navigate rapidly expanding literature and identify relevant prior work. Yet the reasoning behind why a particular paper is cited is often implicit or opaque. Although academic search engines and literature tools rank candidate papers for a query, the motivations underlying these rankings are rarely transparent, making it difficult for scholars to interpret and act on retrieved results-especially in biomedical research where domain knowledge is essential.
RESULTS: We propose an encoder-decoder framework that leverages curated biomedical knowledge to generate explanations of citation motivation in a structured bio-triplet format. We evaluate the approach against recent families of pretrained language models for text generation, including BERT-style (and variants) and GPT-style (and variants) models. In cancer-focused experiments using PubMed Central, we annotate over 10,000 citation relations with bio-triplets grounded in curated knowledge from multiple biomedical databases. Trained on these annotations, our model outperforms strong sequence-generation baselines, improving precision, recall, and F1 for citation-motivation generation.
AVAILABILITY AND IMPLEMENTATION: Code and data are available at Zenodo (archival DOI: 10.281/zenodo.14893445) and GitHub: https://github.com/zhongxiangboy/Knowledge-based-Citation-Reasoning-for-Biomedical-Domain.
SUPPLEMENTARY INFORMATION: No supplementary information is available for this manuscript.

Keywords:  biomedical literature; citation reasoning; text mining

DOI:  https://doi.org/10.1093/bioinformatics/btag061
J Orthod. 2026 Feb 26. 14653125251408302

Advertising of orthodontic appliances on websites in the UK: Do they comply with advertising standards? A cross-sectional study.

Arunika Nehra, Adam Jones, Trevor Hodge.

   AIM: To evaluate the compliance of websites promoting proprietary orthodontic appliances available in the UK against advertising standards outlined by the Advertising Standards Authority Committee of Advertising Practice (ASA CAP) Code.
DESIGN: Cross-sectional study.
SETTING: Websites promoting proprietary orthodontic appliances available in the UK, including fixed, removable and aligner systems sold as a complete system under a brand name.
METHODS: A comprehensive, systematic approach was adopted, beginning with a 2020 scoping search on Google and social media platforms (Instagram and Facebook) to identify keywords. Keyword searches were conducted on Google in 2020, 2023 and 2025 to identify relevant websites. To ensure contemporary relevance, only websites identified in the final 2025 search were included for analysis. Raters underwent training and calibration before independently evaluating websites for compliance with advertising standards using bespoke judgement criteria derived from the ASA CAP Code, across four domains: comprehensiveness of treatment information; presentation of treatment information; objectivity of treatment information; and substantiation of claims. Discrepancies were resolved through group discussion to determine agreed scores. Data were analysed using descriptive and inferential statistics (Fleiss' kappa and Kruskal-Wallis tests).
RESULTS: The 2025 search identified 970 websites, of which 39 met the inclusion criteria. Inter-rater reliability showed almost perfect agreement (kappa >0.9). Compliance varied significantly across domains: 45% of all claims provided comprehensive information, 54% had clear presentation, 38% maintained objectivity and only 4% of claims were substantiated with evidence. Nearly all websites (95%) omitted common risks and 92% failed to mention alternative treatments. Direct-to-consumer and tele-dentistry websites showed poorer compliance than dentist-delivered systems.
CONCLUSIONS: Orthodontic appliance websites showed poor compliance with ASA CAP Code standards. The majority used descriptive language and words in place of numbers to quantify magnitude, alongside subjective content and unsubstantiated claims, with omissions of treatment risks. These findings raise significant concerns about online orthodontic advertising and its potential impact on informed patient decision-making.

Keywords:  advertising; aligners; direct-to-consumer; orthodontics; quality of information; tele-dentistry

DOI:  https://doi.org/10.1177/14653125251408302
Sci Data. 2026 Feb 24. pii: 301. [Epub ahead of print]13(1):

PreprintToPaper dataset: connecting bioRxiv preprints with journal publications.

Fidan Badalova, Julian Sienkiewicz, Philipp Mayr.

The PreprintToPaper dataset connects bioRxiv preprints with their corresponding journal publications, enabling large-scale analysis of the preprint-to-publication process. It comprises metadata for 145,517 preprints from two periods, 2016-2018 (pre-pandemic) and 2020-2022 (pandemic), retrieved via the bioRxiv and Crossref APIs. We selected the two periods to capture preprint-publication dynamics before and during the COVID-19 pandemic while avoiding transitional years. Each record includes bibliographic information such as titles, abstracts, authors, institutions, submission dates, licenses, and subject categories, alongside enriched publication metadata including journal names, publication dates, author lists, and further information. In addition to the main dataset, a version-history subset provides all available versions of preprints within the two selected periods, enabling analysis of how preprints evolve over time. Preprints are categorized into three groups: Published (formally linked to a journal article), Preprint Only (posted on a preprint server), and Gray Zone (potentially published in a journal but unlinked). To enhance reliability, title and author similarity scores were computed, and a human-annotated subset of 299 records was created to evaluate Gray Zone cases. The dataset supports diverse applications, including studies of scholarly communication, open science policies, bibliometric tool development, and natural language processing research on textual changes between preprints and their corresponding journal articles.

DOI: https://doi.org/10.1038/s41597-026-06867-3
Sci Data. 2026 Feb 25.

A Dataset for Addressing Patient's Information Needs related to Clinical Course of Hospitalization.

Sarvesh Soni, Dina Demner-Fushman.

Patient's unique information needs about their hospitalization can be addressed using clinical evidence from electronic health records (EHRs) and artificial intelligence (AI). However, robust datasets to assess the factuality and relevance of AI-generated responses are lacking and, to our knowledge, none capture patient information needs in the context of their EHRs. To address this gap, we introduce ArchEHR-QA, an expert-annotated dataset of 134 cases from intensive care unit and emergency department settings. Cases comprise patient questions from public health forums, clinician-interpreted versions, relevant clinical note excerpts with sentence-level relevance annotations, and clinician-authored answers. To establish benchmarks for grounded EHR question answering (QA), we evaluated three open-weight large language models (Llama 4, Llama 3, and Mixtral) across three prompting strategies. We assessed performance on two dimensions: Factuality (overlap between cited and ground truth evidence) and Relevance (similarity to reference answers).

DOI: https://doi.org/10.1038/s41597-026-06639-z
ACS Infect Dis. 2026 Feb 23.

AntibioticDB: An Updated and Improved Open-Access Database for the Antibacterial Research and Development Community.

Luiza H Galarion, Alan Hennessy, Simon D Harding, Jane F Armstrong, Astrid Pentz-Murr, Jamie A Davies, Alex J O'Neill, Laura J V Piddock.

  AntibioticDB (https://www.antibioticdb.com/), originally established in 2017 and since 2021 led by the Global Antibiotic Research & Development Partnership (GARDP), is a freely available database of antibacterial agents to facilitate research and development of new antibacterial therapeutics. Here, we describe a new release of AntibioticDB that has been significantly expanded and updated with the aid of user feedback and which offers additional functionality through a redesigned web portal. Improvements include reciprocal integration with the IUPHAR/BPS Guide to Pharmacology (https://www.guidetopharmacology.org), capturing of compound structure information in the form of standard chemical identifiers (canonical and isomeric SMILES, InChI, and InChI Key), chemical 2D structure images, and harmonizing terminology to optimize database searching. Ongoing curation efforts have increased the number of individual entries to >3,500, a process driven mostly by a significant expansion of historical natural product antibiotics that were previously under-represented in the database. The database is continuously updated by mining the published literature and capturing newly discovered antibacterial compounds as they are reported, making AntibioticDB the most complete global resource on antibacterial agents.

Keywords:  AntibioticDB; antibacterial; antibiotic; chemical; database; structure

DOI:  https://doi.org/10.1021/acsinfecdis.5c00955
J Bone Joint Surg Am. 2026 Feb 26.

A Blinded Analysis of Quality and Fidelity in Orthopaedic Patient Education Materials Simplified by ChatGPT and Humans.

Joseph E Nassar, Maxwell Sahhar, Lama A Ammar, Joseph Carroll, Anne-Emilie Rouffiac, Marco Kaper, Alex Hernandez-Manriquez, Manjot Singh, Edward Akelman, Alan H Daniels.

BACKGROUND: Orthopaedic patient education materials (PEMs) within Epic's Elsevier library often exceed the recommended sixth-grade reading level, with a mean grade of 8.6 in English and 5.8 in Spanish, risking poor patient comprehension and adherence. The present study evaluated whether artificial intelligence (AI)-based text simplification can improve readability while preserving clinical accuracy. The objectives were to use previously established readability data for English and Spanish PEMs as baselines, to assess the impact of human-based and ChatGPT-based simplification on reading grade level, and to compare the fidelity of simplified texts against standard materials.
METHODS: In March 2025, 806 orthopaedic PEM documents were simplified using standardized ChatGPT prompts. Readability was reassessed using validated English and Spanish formulas, and fidelity was evaluated in the 86 PEMs that also had human easy-to-read versions. Two blinded clinicians compared human and ChatGPT-4o outputs with the originals to identify hallucinations, omissions, and inconsistencies according to severity. Following the release of ChatGPT-5, an unblinded post hoc analysis was performed using identical criteria.
RESULTS: ChatGPT-4o-simplified PEMs showed mean reading grade levels of 6.1 in English and 3.5 in Spanish. Compared with human simplifications, ChatGPT-4o showed fewer English omissions, similar Spanish omissions, fewer inconsistencies in both languages, and comparable English hallucinations, but higher Spanish hallucinations. Compared with ChatGPT-4o, ChatGPT-5 preserved English performance and improved Spanish fidelity, reducing hallucinations to human-comparable rates.
CONCLUSIONS: AI-driven simplification can produce orthopaedic PEMs that are easier to read while maintaining acceptable fidelity. The improvements observed with ChatGPT-5 highlight its potential for clinician-supervised use in generating accessible and reliable PEMs.
CLINICAL RELEVANCE: This study is clinically relevant because orthopaedic PEMs are routinely delivered through the Epic electronic health record and directly affect patient understanding, consent, and adherence in both English and Spanish. By evaluating the readability and fidelity of AI-simplified materials across languages, this study informs safe, scalable strategies to improve patient communication in everyday orthopaedic practice.

DOI: https://doi.org/10.2106/JBJS.25.00982
Med Sci Monit. 2026 Feb 28. 32 e951815

Do Large Language Models Perform Equally Across Languages? A Comparison of Responses to Frequently Asked Questions in Anesthesiology.

Hadi Ufuk Yörükoğlu, Can Aksu, Pervez Sultan, Serkan Tulgar.

BACKGROUND With the increasing use of large language model (LLM) chatbots in healthcare, evaluating their ability to provide reliable and understandable information in multiple languages is critical, particularly in fields such as anesthesia, where patient education is essential. The study primarily aimed to compare the quality of ChatGPT 4.0's and DeepSeek V3's English responses, with secondary aims to evaluate content and communication differences between English and Turkish responses. MATERIAL AND METHODS Anesthesiologists proficient in both languages were recruited as experts. Ten frequently asked questions in anesthesia were selected and translated for evaluation. Responses from ChatGPT 4.0 and DeepSeek V3 in both English and Turkish were assessed for overall quality and content quality (accuracy, comprehensiveness, and safety) and communication quality (understanding, empathy/tone, and ethics), and Turkish and English responses were compared by the evaluators. RESULTS Eleven experts evaluated the responses. English responses of ChatGPT 4.0 were superior to the English responses of DeepSeek V3 in overall (P<0.001). English responses of ChatGPT 4.0 were superior to the Turkish responses in the terms of overall, content, and communication quality (P<0.001 each) and English responses of DeepSeek V3 were superior to the Turkish responses in the terms of overall (P<0.001), content (P<0.001) and communication (P=0.001) quality. CONCLUSIONS ChatGPT 4.0 performed better than DeepSeek V3 in the English language in terms of overall quality of responses to 10 frequently asked questions in the field of anesthesia and the English responses provided by ChatGPT 4.0 and DeepSeek V3 outperformed the Turkish responses.

DOI: https://doi.org/10.12659/MSM.951815
Diagnostics (Basel). 2026 Feb 15. pii: 589. [Epub ahead of print]16(4):

Performance of ChatGPT-4o, Gemini 2.0 Pro, and DeepSeek-V3 in Patient-Facing Information on Chest Wall Deformities: A Comparative Evaluation of Accuracy, RELIABILITY, and Reproducibility.

Deniz Oke, Ozge Gulsum Illeez, Esra Giray, Betül Çiftçi.

  Background: Large language models (LLMs) such as DeepSeek-V3, Google Gemini 2.0 Pro, and ChatGPT-4o are increasingly used by patients seeking online medical information. However, their accuracy, reliability, and reproducibility in patient-facing content related to chest wall deformities (CWD) remain unclear. This study aimed to compare the performance of three contemporary LLMs in generating information on pectus excavatum, pectus carinatum, and related thoracic deformities. Methods: Eighty patient-facing questions were developed across eight thematic domains and independently submitted to each model using newly created accounts over two consecutive days. Accuracy was assessed using a validated four-point rubric by blinded physiatrists, and reproducibility was evaluated using agreement metrics and weighted Cohen's kappa. Results: ChatGPT-4o achieved the highest overall accuracy (median score: 1.20), the greatest proportion of fully accurate responses, and the lowest hallucination rate (5.0%). Gemini showed intermediate accuracy, while DeepSeek-V3 demonstrated the lowest accuracy and highest hallucination rate (11.25%). Across all models, general-information and quality-of-life domains had the best performance, whereas treatment-related questions showed the most errors. Reproducibility was highest for ChatGPT-4o (weighted κ = almost perfect), followed by Gemini and DeepSeek-V3. Inter-rater reliability was substantial (Fleiss' κ = 0.69). Conclusions: Contemporary LLMs can generate largely accurate and reproducible patient-facing information on CWD, with ChatGPT-4o showing the strongest overall performance. This study provides the first domain-specific comparative evaluation of LLMs in CWD and integrates reproducibility analysis alongside accuracy and reliability assessment. While these tools may support patient education, treatment-related responses require caution, and LLMs should be used as adjuncts rather than substitutes for clinical counseling.

Keywords:  ChatGPT-4o; DeepSeek-V3; Google Gemini 2.0 Pro; accuracy; artificial intelligence; chest wall deformity; hallucinations; patient education; reproducibility

DOI:  https://doi.org/10.3390/diagnostics16040589
Front Med (Lausanne). 2026 ;13 1752664

Artificial intelligence vs. human evaluation of anesthesia education videos: a comparative analysis using validated quality scales.

Kubra Taskin, Hulya Yilmaz Ak.

   Background: YouTube has become an increasingly popular platform for medical education, yet the accuracy and educational quality of anesthesia-related videos remain uncertain. While human experts have traditionally assessed video quality using validated scales such as DISCERN, JAMA, and the Global Quality Scale (GQS), artificial intelligence (AI) models-particularly large language models (LLMs)-now offer new possibilities for scalable, objective content evaluation. This study aimed to compare the educational quality of anesthesia education videos produced by humans and AI, and to examine the level of agreement between human expert ratings and ChatGPT-5 evaluations.
Methods: In this cross-sectional analytical study, forty YouTube videos were analyzed: 20 produced by human educators and 20 generated using AI tools. Each video was independently assessed by two anesthesiologists and by ChatGPT-5 Plus (OpenAI, 2025) using DISCERN, JAMA, and GQS criteria. Inter-rater reliability between human evaluators was determined using the Intraclass Correlation Coefficient (ICC), and correlations between human and AI ratings were analyzed with Spearman's rho.
Results: Human-generated videos scored significantly higher than AI-generated ones in DISCERN (68.45 ± 4.60 vs. 62.77 ± 7.32, p = 0.0044, Cohen's d = 0.82) and JAMA (3.70 ± 0.41 vs. 3.23 ± 0.77, p = 0.0446, Cohen's d = 0.71) scores, whereas no significant difference was observed in GQS scores (p = 0.3033). Inter-rater reliability between human experts was excellent (ICC = 0.81-0.86, p < 0.001). Strong correlations were found between ChatGPT-5 and the human mean scores for all scales (ρ = 0.897 for DISCERN, ρ = 0.785 for GQS, ρ = 0.765 for JAMA; p < 0.001), indicating high agreement between AI and human evaluations.
Conclusion: AI-based models such as ChatGPT-5 show potential to approximate human expert judgment in evaluating educational content. While human-generated videos remain superior in terms of source transparency and ethical reporting, AI-generated content approaches human quality in structural organization and linguistic fluency. These findings suggest that AI-assisted evaluation systems may serve as standardized, efficient tools for quality screening of large-scale educational video archives in medical education.

Keywords:  ChatGPT-5; DISCERN scale; YouTube; anesthesia education; artificial intelligence; medical video quality

DOI:  https://doi.org/10.3389/fmed.2026.1752664
Orthop J Sports Med. 2026 Feb;14(2): 23259671251414852

Evaluating Artificial Intelligence-Generated Responses to Patient Questions Regarding Orthobiologic Injections.

Benjamin W King, Jesse Seilern Und Aspang, Kyle Hammond, Destin Hill, Prathap Jayaram, Jay Patel, Richard M Danilkowicz.

   Background: Patient interest in orthobiologic injections continues to grow. While modern patients are increasingly reliant on artificial intelligence (AI) large language models (LLMs) for health information, it remains unclear whether AI-generated responses regarding orthobiologics are both accurate and written at a reading level suitable for patient education.
Purpose: To assess the accuracy and readability of responses to common patient questions regarding orthobiologic injections from 3 popular AI LLMs (ChatGPT, Gemini, and Grok).
Study Design: Cross-sectional Study.
Methods: Responses to 20 common patient questions regarding orthobiologic injections were recorded from ChatGPT 4o, Gemini 2.5 Flash, and Grok 3 in July 2025. Four independent reviewers (2 fellowship-trained sports medicine orthopaedic surgeons and 2 fellowship-trained nonoperative sports medicine physicians) assessed AI responses for accuracy using the ChatGPT Response Rating System (CRRS) and the AI Response Metric (AIRM). Readability of responses was assessed using the Flesch-Kincaid Grade Level (FKGL).
Results: Interrater reliability was strong for all accuracy ratings (ICCs >0.70; P < .05). While response accuracy was generally acceptable, 50% (10/20) of ChatGPT, 25% (5/20) of Gemini, and 30% (6/20) of Grok responses were deemed as requiring more than minimal clarification (CRRS >2). One-way matched analysis of variance (ANOVA) revealed a significant effect of AI model on both CRRS (P = .02) and AIRM scores (P = .02), with Gemini displaying improved accuracy compared with ChatGPT (CRRS, P = .04; AIRM, P = .03). Regarding readability, the mean FKGL of all 3 models was at a collegiate level or higher, and all responses exceeded the American Medical Association and National Institutes of Health-recommended 6th-grade reading level for patient education. One-way matched ANOVA revealed a significant effect of AI model on FKGL (P = .02), with Gemini displaying reduced readability compared with ChatGPT (P = .03).
Conclusion: In this study, ChatGPT, Gemini, and Grok provided generally accurate information on orthobiologics but failed to produce responses at a patient-appropriate readability level. Gemini outperformed ChatGPT in accuracy, although all 3 models demonstrated significant limitations in clarity. Until these issues are resolved, AI-generated responses should serve only as supplemental resources, with final patient education directed by physicians.

Keywords:  artificial intelligence; concentrated bone marrow aspirate; orthobiologics; platelet-rich plasma

DOI:  https://doi.org/10.1177/23259671251414852
Appl Clin Inform. 2026 Feb 24.

Evaluation of ChatGPT and Gemini in Answering Patient Questions after Gynecologic Surgery.

Petra C Voigt, Rhea Sharma, Angela Chaudhari, Susan Tsai, Magdy P Milad, Linda C Yang.

OBJECTIVE: To explore the performance of ChatGPT version 4.0 (GPT-4) and Gemini Advanced (Gemini) large language models (LLMs) in addressing common patient questions after gynecology surgery with regards to accuracy, relevance, helpfulness, and readability.
METHODS: In this cross-sectional study, two LLMs were prompted to generate answers to post-operative patient questions after gynecologic surgery. Post-operative patient questions were developed to simulate common patient questions after gynecologic surgery, based on expert opinion and compiled from anonymous posters on Reddit (r/endometriosis). Six topics were emphasized: endometriosis, vaginal bleeding, bowel/bladder function, incision care, resumption of activities, and sexual function. Questions were asked in a systematic submission process with the memory reset after each query. Responses were blinded and independently assessed for accuracy and relevance on a 5 Point Likert scale by four board-certified gynecologic surgeons with fellowship training in gynecologic surgery. Readability was calculated with the Flesch Kincaid grade level. Responses were also evaluated by three clinic nurses.
RESULTS: 41 questions were posed to GPT-4 and Gemini three times. These responses were independently evaluated by four surgeons and three nurses leading to a total of 1,968 evaluations for accuracy, relevance, helpfulness to the average patient, and readability. Surgeons and nurses graded Gemini responses as more accurate (4.23 vs 4.03, p=0.015) and helpful (4.37 vs 4.21, p=0.025) than GPT-4 responses. Responses from both models were similarly found to be relevant or very relevant (4.45 vs 4.36, p=0.2). Most responses by GPT-4 (85%) and Gemini (87%) were consistent across all questions. The average reading level for GPT-4 and Gemini responses were 11th and 10th grade, above the recommended 6th grade reading level for patient information.
CONCLUSION: GPT-4 and Gemini provided overall accurate, relevant, and helpful responses to common post-operative patient questions for gynecologic surgery. Gemini outperformed GPT-4 in accuracy and helpfulness and had more readable responses.

DOI: https://doi.org/10.1055/a-2818-1611
Clin Shoulder Elb. 2026 Mar;29(1): 71-79

Evaluating large language model responses to patient questions on ulnar collateral ligament repair.

Benjamin W King, Evan P Bailey, Eric Warren, Grant Garrigues, Kyle Hammond, Richard M Danilkowicz.

   BACKGROUND: The incidence of ulnar collateral ligament (UCL) repair continues to increase, so evaluating the accuracy and readability of information about this procedure that is produced by artificial intelligence (AI) models is important. This study assesses AI-generated responses to common patient questions about UCL repair.
METHODS: Twenty patient questions frequently encountered in clinical practice were submitted to ChatGPT, Gemini, and Grok. Three fellowship- trained orthopedic surgeons independently rated answer accuracy using the ChatGPT Response Rating System (CRRS) and AI Response Metric (AIRM), which assign scores from 1-5, with lower scores indicating better accuracy. Responses with CRRS >2 were classified as requiring more than minimal clarification. Readability was evaluated using the Flesch-Kincaid Reading Ease (FKRE) and Grade Level (FKGL) metrics. Responses with an FKGL >6 exceeded the American Medical Association (AMA) and National Institutes of Health (NIH) recommended 6th grade reading level for patient education materials.
RESULTS: More than minimal clarification was required for 15% (3/20) of ChatGPT, 5% (1/20) of Gemini, and 40% (8/20) of Grok responses. Gemini (CRRS, 1.5±0.5; AIRM, 1.6±0.5) demonstrated significantly better accuracy than ChatGPT (CRRS, 2.0±0.4; P=0.0002; AIRM, 2.2±0.5; P=0.0001) and Grok (CRRS, 2.1±0.7; P=0.005; AIRM, 2.4±0.8; P=0.002). All responses exceeded the AMA/NIH 6th grade reading level threshold (FKGL >6). Gemini produced the highest FKGL (16.2±2.2), significantly higher than ChatGPT (14.4±1.6, P=0.005) and Grok (14.6±1.7, P=0.017). FKRE did not differ significantly among models (P=0.14).
CONCLUSIONS: AI models generated generally accurate information about UCL repair but at reading levels far above the AMA/NIH recommendations. In this study, Gemini was the most accurate model and produced the least readable content. Level of evidence: III.

Keywords:   Collateral ligament; Comprehension; Patient education; Ulna; Artificial intelligence

DOI:  https://doi.org/10.5397/cise.2025.01214
J ISAKOS. 2026 Feb 19. pii: S2059-7754(26)00021-0. [Epub ahead of print] 101085

Readability assessment of ChatGPT 5.0 responses are more complex for Anterior Cruciate Ligament Reconstruction compared to American Academy of Orthopedic Surgeons' OrthoInfo.

Brandon McMaster, Sandeep Yanamala, Rakan Alshaibi, Kelechi Okoroha, Mario Hevesi, Daniel B F Saris, Aaron Krych, Adam J Tagliero.

INTRODUCTION/OBJECTIVES: ChatGPT shows promise as a search tool and a source of patient information. This study aims to evaluate the readability of information about Anterior Cruciate Ligament Reconstruction (ACLR) available through ChatGPT 5.0 and compare it with the readability of information provided by the American Academy of Orthopedic Surgeons (AAOS).
METHODS: ACLR was chosen due to its extensive coverage on the AAOS OrthoInfo website. The same subsection formats found on the AAOS site were used to query ChatGPT 5.0. The information gathered from both AAOS and ChatGPT 5.0 was analyzed for readability using various established tests: Coleman-Liau, Flesch-Kincaid, Flesch Reading Ease Index, FORCAST Readability Formula, Fry Graph, Gunning Fog Index, Raygor Readability Estimate, and the Simple Measure of Gobbledygook (SMOG) Readability Formula.
RESULTS: The analysis showed that the average reading grade level for ACL reconstruction information on the AAOS OrthoInfo website was 10.2 ± 1.2, suitable for a high school sophomore. The average reading ease score was 56.9 ± 14.2, categorized as "fairly difficult." In contrast, the average reading grade level for ChatGPT's ACL reconstruction information was 12.9 ± 1.6, indicating a college-level reading requirement, with a reading ease score of 38.1 ± 4.1, falling in the "difficult" category. There was a statistically significant difference (p<0.01), Cohen's d = 1.91, in both reading grade level and reading ease between the AAOS and ChatGPT sources.
CONCLUSIONS: This study demonstrates that the readability of ChatGPT 5.0-generated information regarding anterior cruciate ligament reconstruction is higher than that found on the AAOS OrthoInfo website, requiring a higher level of education for comprehension. Clarity and completeness are both critical elements of a tool being used by patients for educational purposes, while the information may be readily available it currently demonstrates poor readability for patients which may contribute to decisional conflicts and the development of excessive patient concern.
LEVEL OF EVIDENCE: IV.

DOI: https://doi.org/10.1016/j.jisako.2026.101085
Dent J (Basel). 2026 Feb 23. pii: 127. [Epub ahead of print]14(2):

Human and Artificial Intelligence (AI) Analysis of Patient Experiences of Periodontal Graft Surgery.

William W N Mak, Timothy Budden, Sushil Kaur, Maurice J Meade.

  Background/Objectives: The prominent role the internet plays in being a source of dental information prompts qualitative evaluation of relevant online content. This study aimed to explore patients' experience regarding periodontal graft surgery communicated through the social media platform YouTube. Methods: An initial YouTube search using the term "gum surgery experience" retrieved 40 videos. Graft surgery was the most frequently discussed procedure, and 19 relevant videos were included in the qualitative analysis. Video content was analysed using a combined human-centered and artificial intelligence (AI)-assisted approach. AI-supported analysis of viewer comments was conducted using ChatGPT-4 and Gemini-1.5 Pro. Themes generated by human and AI analyses were compared. Results: Nine key themes were identified from the 19 videos that satisfied selection criteria. Most themes were similar between human and AI analyses, with six overlapping and three unique. The most frequently coded theme was post-operative recovery (n = 177), with pain, work absence, eating difficulties, and disrupted oral hygiene commonly reported. Patient-clinician relationships were frequently highlighted, with mixed experiences regarding communication and trust. Positive experiences were reported more frequently than negative. Comment analysis revealed varied audience engagement and sentiments, emphasizing concerns about pain, recovery, and procedural anxiety. Conclusions: Key themes related to patient experiences were identified, notably concerns regarding post-operative recovery and patient-clinician relationships. Challenges in finding information prior to having surgeries motivated patients to provide support and advice on YouTube, emphasizing the need for patient-centered resources and effective patient-clinician communication. Integrating human and AI methods in qualitative analysis was efficient and insightful, with AI supplementing but not substituting human research.

Keywords:  YouTube; artificial intelligence; patient experience; periodontal surgery; qualitative analysis; quality of information; social media

DOI:  https://doi.org/10.3390/dj14020127
Euroasian J Hepatogastroenterol. 2025 Jul-Dec;15(2):15(2): 173-177

Assessing Patient Education Guide Generated by ChatGPT vs Google Gemini on Common Hepatology Conditions: A Cross-sectional Study.

Priyal Patel, Aamuktha Marepalli, Krithika Nathan, Jahnavi Akkaldevi, Niraj Balakrishnan.

   Aims and objectives: To compare ChatGPT and Google Gemini-generated patient education guide on hepatitis, cirrhosis, and non-alcoholic fatty liver disease.
Introduction: As artificial intelligence (AI) is becoming more integrated into healthcare, assessing the quality of health information it generates is important. This study evaluates patient information guides produced by ChatGPT and Google Gemini for common hepatology conditions, focusing on accessibility, clarity, and comprehensiveness.
Methodology: Guides from both AI systems were evaluated using Flesch-Kincaid readability tests, Quillbot for similarity scores, and the DISCERN score for reliability. A quantitative analysis was conducted on various parameters, including word and sentence counts.
Results: ChatGPT generated significantly more words and sentences than Google Gemini, indicating more extensive content. However, there were no statistically significant differences in average words per sentence, syllable count, grade level, ease score, similarity percentage, or reliability scores, suggesting comparable complexity and consistency between the two models.
Conclusions: The findings underscore the need to refine AI-generated health information to meet diverse patient needs. While AI shows promise in enhancing patient education, continuous evaluation and adaptation are essential to ensure clarity and balance in the information provided. Recommendations include improving content accessibility and reliability for optimal patient engagement.
How to cite this article: Patel P, Marepalli A, Nathan K, et al. Assessing Patient Education Guide Generated by ChatGPT vs Google Gemini on Common Hepatology Conditions: A Cross-sectional Study. Euroasian J Hepato-Gastroenterol 2025;15(2):173-177.

Keywords:  Artificial intelligence; ChatGPT; Cirrhosis; Educational tool; Gastroenterology; Google Gemini; Hepatitis; Hepatology; Non-alcoholic fatty liver disease; Patient education brochure

DOI:  https://doi.org/10.5005/jp-journals-10018-1482
JBRA Assist Reprod. 2026 Feb 02.

#infertility education on Instagram: A Reliability Analysis in Brazil Using the DISCERN Instrument.

Márcia Mendonça Carneiro, Renata Bossi, Debora Alvarenga, Ana Carolina Xavier, Rodrigo Hurtado, Marcos Sampaio, Marisa Mendonça Carneiro.

   OBJECTIVE: To assess the prevalence, authorship, and reliability of educational fertility-related information shared on Brazilian Instagram, measured by the DISCERN instrument.
METHODS: This cross-sectional study analyzed public fertility-related educational Instagram posts published in March 2021 related to the following hashtags: #infertility, #ivf, #endometriosis, #tryingtoconceive, #maternity, #humanreproduction, #pregnancy, #invitrofertilization, #assistedreproduction, #pregnant, #difficulttogetpregnant. Educational posts were evaluated using the DISCERN tool. Authorship was categorized as either healthcare professional (HCP) or layperson (LP).
RESULTS: The majority of the 37 analyzed posts were authored by HCP (n = 33; 89.2%), with 24 of these being physicians, five fertility clinics, four allied HCP, one magazine, and three LP. Most posts (n = 15) focused on fertility treatments; other topics included information about diseases, exams, and diagnosis. The mean Discern analysis scores for each question were: (1) Are the aims clear? 4.86; (2) Does it achieve its aims? 4.75; (3) Is it relevant? 4.47; (4) Is it clear what sources of information were used to compile the publication (other than the author or producer)? 1.58; (5) Is it clear when the information used or reported in the publication was produced? 1.59; (6) Is it balanced and unbiased? 1.59; (7) Does it provide details of additional sources of support and information? 1.04; (8) Does it refer to areas of uncertainty? 1.39.
CONCLUSIONS: Although physicians authored most posts that clearly stated the aim and relevance, important issues such as the source of information used and details about additional sources of support and information were not available. The posts analyzed here failed to be balanced and unbiased and did not inform about potential areas of uncertainty.

Keywords:  infertility; patient education; social media

DOI:  https://doi.org/10.5935/1518-0557.20260006
BMC Med Inform Decis Mak. 2026 Feb 21.

Quality evaluation of online platforms information for retail of prescription medicines in China: an observational study of the case of paroxetine.

Han Yao, Chutong Li, Xiaonan Shi, Bo Peng, Zhenyang Kang, Jing Sun, Yuanli Liu.

   OBJECTIVES: This study aimed to generate evidence to help improve the quality of information provided by Chinese online platforms for retail of prescription medicines, as a baseline for enhancing the supervision by the regulatory authorities of internet retail of prescription medicines to offer comprehensive, accurate and reliable information to consumers.
METHODS: This is an observational study to investigate the quality of the information provided by two key types of online platforms in China that sell paroxetine online, a common mental health medication. The authors utilized the DISCERN with the Likert 5 scale, a validated instrument comprising three domains and 16 items to assess the comprehensiveness, accuracy and reliability of information provided by a total of 34 Chinese online platforms. We firstly examined whether these online platforms have been diligent in their duty to provide appropriate retail service-related information (e.g. delivery, payment, return policies) for prescription medications. We also evaluated the quality of the medication information associated with treatment plans for specific prescriptions provided by the online platforms. Three trained master medical students independently conducted the evaluation. By securing the consistency of the scoring results among different evaluators and the reliability of the evaluation, we adopted the mean score of the three evaluators for each item to perform further analysis at the item and domain levels, and between two types of online platforms.
RESULTS: Among the 34 online platforms, 2 were evaluated as providing excellent information for retail of paroxetine (1/34, 5.88%), 6 were good (6/34, 17.65%), 25 were fair (25/34, 73.53%), and 1 was poor (1/34, 2.94%). The key identified flaws of the online platforms for providing appropriate information for retail of paroxetine included (1) inadequate provision of medication information, (2) absence of prescription validation and authentication, and (3) lack of professional pharmacy services such as mandatory pharmacist consultations or patient-specific drug counselling.
CONCLUSIONS: Consumers face challenges in obtaining comprehensive, accurate and reliable information from the online platforms for common mental health prescription medications. The quality of information provided by the online platforms for retail of prescription medicines is generally suboptimal, and the compliance of prescription validation and authentication is problematic. These highlight the need for strengthening the supervision of the online platforms with novel approaches along with their rapid developments, and ensuring appropriate online information provided to consumers.

Keywords:  DISCERN; E-commerce; Information; Online; Paroxetine hydrochloride; Platform; Prescription medicines; Quality; Retail

DOI:  https://doi.org/10.1186/s12911-026-03386-4
Am J Med Sci. 2026 Feb 19. pii: S0002-9629(26)00075-3. [Epub ahead of print]

Readability of Online Patient Education Materials for Peripartum Cardiomyopathy.

Mansunderbir Singh, Sahith Reddy Thotamgari, Anna Camille Tatum, Priyanka Anvekar, Kalgi Modi.



Keywords:  Health literacy; Heart failure; Online health information; Patient education; Patient-centered care; Peripartum cardiomyopathy; Readability

DOI:  https://doi.org/10.1016/j.amjms.2026.02.010
Int J Womens Dermatol. 2026 Mar;12(1): e251

A comprehensive analysis of online patient educational materials for lichen sclerosus.

Jennifer Foster, Hannah R Chang, Melissa M Mauskar.



Keywords:  actionability; lichen sclerosus; patient education; readability; understandability

DOI:  https://doi.org/10.1097/JW9.0000000000000251
J Endod. 2026 Feb 25. pii: S0099-2399(26)00076-2. [Epub ahead of print]

Online Postoperative Instructions after Nonsurgical Root Canal Treatment: Readability, Usability, and Transparency.

Tara Boroumand, Mohammad A Sabeti.

   INTRODUCTION: Postoperative recovery after root canal treatment (RCT) relies on patients' ability to interpret instructions. However, the readability, usability, and transparency of online postoperative instructions for nonsurgical RCT are unclear. This study evaluated their readability, understandability, actionability, and transparency using a standardized Google search.
METHODS: We performed a cross-sectional analysis of online postoperative instructions for nonsurgical RCT from the first 100 Google search results. Readability was assessed using four grade-level formulas and summarized as an Average Grade Level (AGL). Understandability and actionability were evaluated using the Patient Education Materials Assessment Tool for Printable Materials (PEMAT-P), and transparency was assessed against Journal of the American Medical Association (JAMA) benchmarks. Outcomes were compared with recommended thresholds and between practice types.
RESULTS: Sixty-three webpages met inclusion criteria. Mean AGL was 11.49; no webpage met the recommended sixth-grade reading level. Endodontic practice webpages were less readable than general practice webpages (AGL 11.82 vs 11.16; P = .022). Mean PEMAT-P understandability and actionability were 74.34% and 60.16%; 47/63 webpages (75%) met the understandability benchmark, but 7 (11%) met the actionability benchmark. Readability was not correlated with PEMAT-P scores. JAMA transparency scores were low; most webpages met only one criterion, and none met all four.
CONCLUSIONS: Online postoperative instructions for nonsurgical RCT require reading levels above recommended targets, offer limited actionable information, and lack transparency. Endodontic practice webpages are less readable than general practice webpages, yet they do not provide better understandability, actionability, or transparency. These findings highlight the need for guideline-based, low-literacy, actionable postoperative instructions.

Keywords:  Postoperative instruction; health literacy; internet; patient education; root canal treatment

DOI:  https://doi.org/10.1016/j.joen.2026.02.013
Dis Colon Rectum. 2026 Feb 24.

Quality and Literacy Assessment of ASCRS Patient Education Material for Spanish and Chinese Speakers: A Call for Improvement.

Ana Sofia Ore, Victoria Yuan, Pia Canal Zarate, Steven Y Chao.

DOI: https://doi.org/10.1097/DCR.0000000000004122
Adv Health Inf Sci Pract. 2026 ;2(1): eSVXO9846

Evaluating the Readability, Accessibility, and Quality of Online Health Information on Opioid Use Disorder: A Comparative Analysis of Government and Non-Government Websites.

Fatema Z Ahmed, Alice Noblin, Kendall Courtelyou, Georgiana Tynes.

   Background: The opioid crisis in the US remains a critical public health issue. Accessible, high-quality online health information is vital for educating the public about opioid use disorder (OUD) and available treatment options. This study aims to evaluate and compare the readability, accessibility, and quality of OUD-related information on government and non-government websites to identify areas for improvement.
Methods: A total of 30 websites, 21 government-operated and 9 non-governmental, were selected through a systematic search. Readability was assessed using Gunning Fog, Simple Measure of Gobbledygook (SMOG), and Flesch Reading Ease Score (FRES) tests. Accessibility was evaluated using the WAVE tool, which checks for errors and adherence to Web Content Accessibility Guidelines (WCAG). Quality was measured using the DISCERN Instrument, a standardized tool for evaluating the quality of health information. Data were analyzed for statistical differences between government and non-government websites.
Results: The average readability grade level across all websites was 14.63 and average FRES score was 31.63, indicating content requiring advanced reading skills, with no significant difference between government and non-government websites. Accessibility issues were more prevalent on non-government websites. Government websites scored significantly lower in terms of quality, with 80.95% of them rated as "poor" or "very poor" compared with 22.22% of non-government websites. The overall quality scores for both groups remained suboptimal, with an average DISCERN score of 39.52 out of 80.
Conclusions: The findings highlight a critical need for improvements in the readability, accessibility, and quality of online health information on OUD. Government websites, in particular, require enhancements to ensure they meet the public's need for reliable, accessible, and comprehensible health resources.

Keywords:  Readability; accessibility; health literacy; information; internet; quality

DOI:  https://doi.org/10.63116/SVXO9846
Eur J Dent Educ. 2026 Feb 23.

The Dental Misinformation Epidemic: An Evaluation of Quality, Accuracy and Educational Impact in Youtube's Tooth Bleaching Content.

Letícia de Azevedo Lima Silva, Hugo Juan da Silva Rolim, Gabriella Rodrigues da Silva, Pedro Henrique Sette-de-Souza, Moan Jéfter Fernandes Costa.

   INTRODUCTION: The proliferation of inaccurate information about tooth bleaching on the internet poses a growing risk to oral health, promoting potentially harmful practices while undermining evidence-based professional guidance.
OBJECTIVE: Evaluate the quality, reliability, and educational impact of tooth bleaching videos on YouTube, specifically examining disparities between viewer engagement and scientific rigour.
METHODOLOGY: An analysis was conducted on videos selected from the most searched terms on Google Trends. The assessment used the modified DISCERN, Global Quality Score (GQS) for overall quality, standardized audiovisual criteria and engagement metrics, analysed by two independent researchers.
RESULTS: 71.4% of videos demonstrated low reliability (scores ≤ 2/5), while only 4.7% achieved maximum quality (5/5 GQS). Non-professional content garnered 3.5 × greater engagement than videos produced by health professionals despite 30.2% scoring minimally for reliability (1/5). Audiovisual production quality was prioritized over accuracy, with 100% of videos rated as having high/moderate technical standards despite frequent misinformation.
DISCUSSION: YouTube's. algorithms disproportionately amplify sensationalized, unverified content over evidence-based professional guidance. This promote clinical risks like predominant promotion of unproven methods (e.g., charcoal, lemon) may lead to enamel damage and hypersensitivity.
CONCLUSION: It's urgent a call to action for dental professionals to actively create and disseminate scientifically validated digital content. Healthcare platforms and regulatory bodies must collaborate to develop strategies that prioritize quality and accuracy in dental health communication.

Keywords:  aesthetics; dental; education; health communication; tooth bleaching

DOI:  https://doi.org/10.1111/eje.70114
Medicine (Baltimore). 2026 Feb 27. 105(9): e47925

YouTube as a digital health platform: A cross-sectional thematic and sentiment analysis of the 100 most-viewed knee osteoarthritis videos.

Nurmuhammet Taş, Yakup Erden, Mustafa Hüseyin Temel, Fatih Bağcier.

  Knee osteoarthritis (OA) is a prevalent source of pain and disability. As YouTube becomes increasingly used for health information, the nature and impact of OA-related content warrant systematic evaluation. Using the YouTube Data Application Programming Interface, 327 videos were initially retrieved with the terms "knee osteoarthritis," "gonarthrosis," and "degenerative joint disease." After applying exclusion criteria, the 100 most-viewed English-language videos were selected. Metadata and 60,888 user comments were extracted. Videos were thematically categorized into 6 domains: treatment, disease education, symptoms, diagnosis, patient experience, and uncategorized. Sentiment analysis was performed using the TextBlob library, and statistical trends were assessed via Statistical Package for the Social Sciences (SPSS) 29.0. Treatment was the dominant theme (38%), with nonsurgical options like exercise and rehabilitation comprising over half of these videos (55.3%). Educational (24%) and symptom-related content (12%) were also frequent. Viewer engagement peaked in 2022, with the highest number of views and comments. Over time, video uploads and comment activity increased, best modeled by quadratic trends. Seasonal variation in engagement was not significant. Commonly used words included "exercise," "knee," "thank," and "pain." Sentiment analysis revealed a decline in positive comments and a rise in neutral sentiment, while negative sentiment remained stable. YouTube serves as a growing platform for knee OA information, emphasizing conservative treatments. Although engagement and content volume have increased, sentiment has shifted toward neutrality. Given the variability in content quality, healthcare providers should guide patients toward credible, evidence-based resources to optimize digital health literacy.

Keywords:  YouTube health information; digital health platforms; knee osteoarthritis videos; sentiment analysis YouTube; social media in healthcare

DOI:  https://doi.org/10.1097/MD.0000000000047925
Nutrients. 2026 Feb 13. pii: 619. [Epub ahead of print]18(4):

Sarcopenia and Nutrition on YouTube: A Content Quality and Reliability Assessment.

Carmen Trost, Richard Crevenna, Jacob Heisinger, Domenik Popp, Annemarie Perl, Eva-Maria Marchard, Stephan Heisinger.

   BACKGROUND/OBJECTIVES: Sarcopenia is a prevalent age-related condition strongly influenced by protein intake. This study assessed the quality, evidence base, and practical utility of YouTube videos on nutrition and sarcopenia.
METHODS: A structured YouTube search (We searched YouTube in April 2024 using the term 'sarcopenia diet) identified 41 eligible videos. Three trained raters independently assessed each video using the Global Quality Scale (GQS), DISCERN, JAMA benchmarks, subjective impression ratings, technical quality indicators, and additional binary variables. Interrater reliability was examined using intraclass correlation coefficients (ICCs) and Fleiss' Kappa. Reviewer comments were analyzed qualitatively using Mayring's structured content analysis.
RESULTS: Overall video quality varied substantially. ICCs indicated moderate to high agreement for DISCERN (0.698), JAMA (0.702), and subjective impression ratings (0.779), but minimal agreement for sound and video quality. Fleiss' Kappa showed moderate agreement for scientific soundness (κ = 0.522) and advertisement content (κ = 0.385), while agreement was low for health-related risks and dietary recommendations. Qualitative analysis identified frequent concerns regarding insufficient scientific evidence, vague or impractical protein guidance, limited relevance for older adults, and personal bias; positive features were less common.
CONCLUSIONS: YouTube nutrition content on sarcopenia shows substantial variability and frequent deficits in evidence-based quality and practical relevance. While some videos provide useful introductory information, many are of limited value for lay audiences. Strengthening digital health literacy and expanding expert-driven, evidence-based online resources are essential to support informed decision-making and preventive strategies.

Keywords:  YouTube; digital health; health literacy; nutrition; older adults; sarcopenia

DOI:  https://doi.org/10.3390/nu18040619
Int J Dent. 2026 ;2026 1509354

Evaluation of Popularity, Reliability, and Quality of Dental and Oral Microbiology Videos on YouTube as Sources of Dental Education.

Ahmed Hashim, Anil Bangalore Shivappa.

   Objectives: The expansion of teaching resources in dental education has changed significantly because of the growth of digital video platforms. However, questions remain about their reliability and quality. This study aims to evaluate the popularity, reliability, and quality of YouTube videos on dental and oral microbiology.
Materials and Methods: Two search phrases were used to identify 200 videos, and the video power index (VPI) was utilized to evaluate popularity based on views, likes, and dislikes. The GQS scoring system was used to measure the videos' quality, and JAMA and modified DISCERN tools were used to assess their reliability.
Results: Analyzed videos were classified into four categories: poor, moderate, good, and excellent. VPI of the 61 videos analyzed was moderate (0.06), and JAMA and mDISCERN both achieved low excellent reliability ratings (9.8% and 3.3%), respectively. However, only 6.6% of the videos received a GQS quality rating of excellent.
Conclusion: YouTube algorithms favor shorter content for student engagement and influence the video's popularity. The absence of stringent standards for reliability and quality during production led to the conclusion that a significant proportion of videos was educationally ineffective. Thus, it is essential that educators carefully plan, write reliably, and produce high-quality videos.

Keywords:  YouTube; dental microbiology; oral microbiology; popularity; quality; reliability

DOI:  https://doi.org/10.1155/ijod/1509354
Medicine (Baltimore). 2026 Feb 20. 105(8): e47786

Evaluation of the usability of YouTube videos as educational materials in improving patients' metered dose inhaler use skills: A cross-sectional study.

Canan Arslan, Hicran Yildiz.

  This study aimed to analyze the quality, reliability, and content of YouTube videos related to the use technique of metered dose inhalers (MDIs). Considering the increasing use of digital platforms for health education, it is important to evaluate the accuracy and educational value of such online resources. This descriptive and cross-sectional study evaluated 377 YouTube videos retrieved through a systematic search. The video selection process was monitored using the PRISMA flow diagram according to predefined inclusion and exclusion criteria. The videos' content, quality, and reliability were assessed using the Modified DISCERN tool, Global Quality Scale, Patient Education Materials Assessment Tool, and a checklist containing MDI use steps. Statistical analyses were conducted to determine associations between evaluation scores and video characteristics. A total of 12 YouTube videos on MDI use were analyzed. Of these, 33.3% were uploaded in 2022, and 66.7% were produced by associations or unions in the health sector. The mean interaction index was 0.53 ± 0.32, and the Video Power Index was 0.08 ± 0.15. The mean reliability level of the videos was 2.91 ± 0.75, quality level 4.16 ± 0.76, understandability level 80.83 ± 18.75, actionability level 81.66 ± 18.16, information level 77.00 ± 15.06, and information accuracy level 76.25 ± 15.80. YouTube videos on MDI use were largely adequate in terms of content but demonstrated some deficiencies in reliability. It is recommended that digital health information be produced under professional supervision and in collaboration with health authorities to ensure accuracy and trustworthiness.

Keywords:  educational video; inhaler; material quality; metered dose inhaler; patient education

DOI:  https://doi.org/10.1097/MD.0000000000047786
Digit Health. 2026 Jan-Dec;12:12 20552076261426335

Assessing the content and quality of gout-related videos on video platforms: A cross-sectional study.

Xiangsheng Ye, Hao Sun, Wenze Jiang, Yuqing Zhu.

   Background: The increasing global incidence of gout has heightened public interest in accessible health information. Video-sharing platforms such as Bilibili, Douyin, and YouTube have become major sources of gout-related content, yet the quality and reliability of these videos remain insufficiently evaluated.
Methods: On September 1, 2025, approximately 100 top gout-related videos were retrieved from each platform. Video features and interaction metrics were recorded. Quality and reliability were appraised using the Global Quality Score (GQS), modified DISCERN (mDISCERN), Patient Education Materials Assessment Tool (PEMAT), and Video Information and Quality Index (VIQI). Spearman correlation analyses explored relationships between video characteristics and quality scores.
Results: Overall quality was moderate (GQS: 3.00; mDISCERN: 2.00; PEMAT-T: 75.00; PEMAT-U: 83.33; PEMAT-A: 75.00; VIQI: 15.00; respectively). Douyin videos were significantly shorter with higher interaction metrics but limited depth, while YouTube and Bilibili videos were longer and covered broader topics. YouTube excelled in understandability but lagged in actionability, whereas Douyin performed best in VIQI. Healthcare professional-produced videos significantly surpassed non-professional counterparts across most quality metrics. Correlation analysis confirmed the consistency of these tools in reliability assessment. Meanwhile, positive correlations were observed between video length, view count, and quality scores.
Conclusion: This study reveals that gout-related videos on major platforms exhibit moderate quality and reliability. Professional uploads demonstrated superior thematic diversity and quality compared to non-professionals. Platform-specific differences were observed: Douyin prioritized interactivity over depth, whereas YouTube and Bilibili offered broader topic coverage. Positive correlations between duration, views, and quality imply that extended, evidence-based content fosters greater engagement.

Keywords:  Bilibili; Douyin; Gout; YouTube; public health; quality assessment tools; social media

DOI:  https://doi.org/10.1177/20552076261426335
PeerJ. 2026 ;14 e20828

Quality, reliability and popularity assessment of Turkish YouTube videos on human papillomavirus and its vaccine: content analysis.

Hakan Gülmez, Merve Ciftci.

   Background: Human papillomavirus (HPV) is among the most common sexually transmitted viral infections and is associated with significant health burdens, including genital warts and various cancers. YouTube has emerged as a frequently used platform for accessing health-related information, yet the quality and reliability of such content remain uncertain. Individuals seeking information about HPV and its vaccine frequently turn to YouTube, underscoring the need for systematic evaluation of online video content related to HPV vaccination, given its potential to directly influence public engagement and vaccine acceptance. Analyzing YouTube videos provides insight into the level of information users are exposed to, the potential risks of misinformation, and the overall reliability of health-related digital content. Such evaluations contribute to strengthening public health strategies and support efforts to enhance access to accurate and trustworthy information.
Objective: This study aimed to evaluate the information quality, reliability, and popularity of Turkish-language YouTube videos related to HPV and the HPV vaccine.
Methods: This cross-sectional study was conducted on the YouTube platform using the keywords 'HPV,' 'HPV vaccine ("HPV aşısı"),' and 'human papillomavirus,' and the videos were viewed between July 1 and 31, 2025. The first 200 videos for each keyword were screened, and those meeting the inclusion criteria were analyzed. Video characteristics (e.g., duration, views, likes, comments, like ratio) and content features (e.g., type, source, presentation format, narrator, purpose, citation of sources, recency, vaccine recommendations, anti-vaccine stance) were recorded. Information quality was assessed using the DISCERN instrument (16-80), reliability using the Journal of the American Medical Association (JAMA) Benchmark criteria (0-4), and popularity using the Video Power Index (VPI). Statistical analyses included descriptive statistics, t-tests, ANOVA, chi-square tests, and Pearson correlation analysis, with P < .05 considered significant.
Results: A total of 600 videos were analyzed, with 270 of them meeting the inclusion criteria. The mean duration was 5.62 ± 13.98 minutes (range: 0.12-147), and the mean number of views was 35,829.5 ± 156,796.8 (range: 131-2,260,592). The mean DISCERN, JAMA, and VPI scores of the videos were calculated as 47.6 ± 15.1 (range: 16-80), 1.9 ± 0.9 (range: 0-4), and 82.9 ± 318.6 (range: 0-3,980), respectively. More than half of the videoswere prepared by healthcare professionals (52.9%, n = 143). When categorized by content, a relatively large proportion of the videos focused on vaccination (40.4%, n = 109). A substantial proportion of the videos (58.5%, n = 158) explicitly recommended the HPV vaccine, while only a small proportion (1.9%, n = 5) expressed an anti-vaccine stance. As videos are presented by healthcare professionals and their duration increases, the quality of information improves; however, this quality does not appear to be directly associated with popularity.
Conclusions: Currently, YouTube has become a frequently utilized platform for sharing health information. However, a significant portion of the analyzed HPV-related content is inadequate in terms of information quality and reliability. Promoting longer-duration, current and evidence-based videos prepared by healthcare professionals that cite reliable sources may contribute to the improvement of digital health literacy in the general population.

Keywords:  DISCERN; HPV; HPV vaccine; Health literacy; Human papillomavirus; JAMA; VPI; YouTube

DOI:  https://doi.org/10.7717/peerj.20828
J Pediatr Orthop B. 2026 Feb 25.

Intoeing gait in children: one of the most common pediatric orthopedic complaints and an evaluation of the quality of related YouTube videos.

Tahsin Olgun Bayraktar, Nazim Erkurt, Ali Yüce, Mustafa Yerli, Serdar Aki, Mehmet Selçuk Saygili.

  Intoeing is a common reason for pediatric orthopedic consultations. Families increasingly use YouTube for medical information, but the reliability and quality of this content are unclear, and no previous study has evaluated videos on intoeing. This study assessed the reliability, educational quality, and popularity of YouTube videos on intoeing using validated scoring systems and a novel disease-specific tool. YouTube was searched using the terms 'intoeing', 'pigeon toe', and 'toeing in'. After applying inclusion and exclusion criteria, 48 videos were analyzed. Video characteristics were recorded, and reliability and quality were evaluated using the Journal of the American Medical Association score, Global Quality Score, DISCERN instrument, and the Intoeing Specific Score (ISS) developed for this study. Popularity was measured using the Video Power Index. Interobserver and intraobserver reliability were calculated, and statistical analyses examined associations between scores, video sources, and content. Overall quality was low: 76.4% of videos scored less than or equal to 2 on the Journal of the American Medical Association score, and 54% were rated poor or very poor by DISCERN. According to the ISS, 43.8% were very poor. Academic and physician-generated videos had higher educational quality but lower popularity than nonprofessional sources. Videos from YouTube-verified uploaders scored significantly higher in all quality measures, yet popularity did not correlate with educational quality. YouTube videos on intoeing are generally low quality, revealing a gap between popularity and reliability. Although academic and physician-generated content is more accurate, it is less represented among popular videos. The ISS showed strong reliability and may be useful for future evaluations of disease-specific online content.

Keywords:  YouTube; health information quality; intoeing; pigeon toe; video analysis

DOI:  https://doi.org/10.1097/BPB.0000000000001341
J Ophthalmol. 2026 ;2026 8987000

Evaluation of Social Media Short-Form Video Content for Patient Education on Vision-Threatening Diseases.

Riya H Patel, David Mothy, Neha Boinpally, Hassaam S Choudhry, Puja Bhavsar, Albert S Khouri.

Purpose: The rapid spread of unreliable misinformation on vision-threatening diseases can significantly affect the eye health behaviors and outcomes of the patients consuming short-form social media content. This study evaluates short-form videos pertaining to vision-threatening diseases to quantify video quality, content, and popularity.
Methods: This cross-sectional study analyzed short-form videos on cataracts, diabetic retinopathy, glaucoma, and age-related macular degeneration from TikTok, Instagram Reels, and YouTube Shorts. A hashtag search identified the first fifty videos on each disease from each social media platform. Two reviewers evaluated them, resolving discrepancies with a third. Outcome measures included number of views, likes, comments, uploader source, content type, modified DISCERN score (0-5 scale), and global quality scale (GQS) score (1-5 scale). Engagement outcomes were summarized descriptively using medians and interquartile ranges, while reliability and quality outcomes were analyzed using one-way ANOVA with Tukey post hoc comparisons.
Results: TikTok videos demonstrated higher median engagement (views, likes, and comments) compared to Instagram Reels and YouTube Shorts. Videos on cataracts had higher engagement statistics compared to the other vision-threatening diseases across all platforms. Physicians were the most common video source (45%). The most common content categories were treatments/management (36%) and general symptoms (22%). YouTube Shorts had a significantly greater average DISCERN (2.93 ± 0.70) and GQS score (3.85 ± 1.26) than Instagram Reels and TikTok (p < 0.001). Videos from patients had the lowest mean DISCERN and GQS scores.
Conclusions: TikTok had the greatest median engagement levels, while YouTube Shorts had the greatest mean quality and reliability. Videos from patients and philanthropists had lower quality scores, while healthcare professionals and organizations had the highest. Future efforts should understand the patients' perspectives, address misinformation, and improve quality across all social media platforms.

DOI: https://doi.org/10.1155/joph/8987000
Sci Diabetes Self Manag Care. 2026 Feb 26. 26350106261422680

Exploring the Quality of TikTok-Based Diabetes Self-Management Education in English and Spanish: A Digital Health Study.

Lisa Diaz, Dante Anthony Tolentino, Mary-Lynn Brecht, Wendie Robbins, Sarah E Choi.

PURPOSE: The purpose of this study was to evaluate TikTok videos about type 2 diabetes mellitus (T2DM) in English and Spanish, with a focus on the Association of Diabetes Care and Education Specialists' 7 self-care behaviors (ADCES7).
METHODS: This descriptive study analyzed 300 TikTok videos via English and Spanish hashtags related to T2DM and categorized self-management education content utilizing the ADCES7 self-care behavior framework. Video content creators were categorized into 3 groups: health care professionals, personal accounts, and companies. The Global Quality Scale (GQS) was used to assess video quality from 1 (poor) to 5 (excellent). User engagement metrics were recorded to examine differences across videos.
RESULTS: The analysis of videos revealed that healthy eating was the most frequently addressed ADCES7 self-care behavior, followed by reducing risk. In contrast, problem-solving and healthy coping were the least represented. Spanish-language videos emphasized healthy coping more than English content. Health care professionals' videos had higher GQS scores than personal accounts and companies. However, overall GQS scores remained low-engagement varied by source, with personal creators generating the highest levels of likes and comments.
CONCLUSIONS: Traditional in-person diabetes education has limited accessibility, especially for various groups, including Hispanic communities. This study found that TikTok content often lacked high-quality and comprehensive coverage of ADCES7 self-care behaviors, with personal accounts generating the most engagement despite lower quality. Given TikTok's rapid growth, there is an urgent need to improve the quality of diabetes-management content to ensure users receive accurate and reliable information.

DOI: https://doi.org/10.1177/26350106261422680
JMIR Form Res. 2026 Feb 24. 10 e79978

TikTok as a Platform for Patient Education and Health Information in Rare Genetic Diseases: Cross-Sectional Study.

Jackson Montgomery Wahman, Rhoda Mariam Hijazi, Hosam Gharib Abdelhady.

   Background: Rare genetic diseases pose significant diagnostic and therapeutic challenges, often leading to delayed diagnoses, misinformation, and patient isolation. Social media platforms have emerged as prominent spaces for health information dissemination and community building among patients with rare diseases.
Objective: This study aimed to evaluate the role of TikTok videos in patient education, community engagement, and information quality related to 5 rare genetic conditions: Ehlers-Danlos syndrome, Marfan syndrome, cystic fibrosis, Wilson disease, and Gaucher disease.
Methods: A cross-sectional analysis was conducted on 184 TikTok videos identified via disease-specific hashtags. Included videos were 15 seconds to 4 minutes long and directly discussed the target diseases. Advertisements, promotional content, and product marketing were excluded. Videos were categorized by creator type: physicians, medical professionals, patients, influencers, nonprofit organizations, and others. Content quality was assessed using the Global Quality Scale (GQS) and a modified DISCERN tool (mDISCERN). Engagement metrics (views, likes, and shares) were recorded. Kruskal-Wallis and chi-square tests evaluated differences across creator categories.
Results: Of the 184 TikTok videos, 88 (47.8%) were created by patients or family members; 31 (16.8%) by influencers, 24 (13.0%) by physicians, 17 (9.2%) by nonprofit organizations, 15 (8.2%) by general users, and 9 (4.9%) by others. Collectively, the videos amassed more than 123 million views. Influencer-generated content accounted for the highest cumulative view count, totaling approximately 60.9 million views. Content produced by medical professionals and physicians demonstrated higher information quality, with mean GQS scores of 3.89 (SD 0.66) and 3.62 (SD 0.71) and mDISCERN scores of 3.11 (SD 0.58) and 3.21 (SD 0.65), respectively. In contrast, videos by influencers and patients exhibited lower quality scores (influencers: GQS mean 1.48, SD 0.60; mDISCERN mean 1.42, SD 0.55; patients: GQS mean 1.57, SD 0.58; mDISCERN mean 1.38, SD 0.52). For Ehlers-Danlos syndrome (n=40 videos, 21.7%), Wilson disease (n=40 videos, 21.7%), and cystic fibrosis (n=34 videos, 18.5%), significant differences in quality scores among creator types were observed (P<.001, P<.001, and P≤.04, respectively). For Marfan syndrome (n=40 videos, 21.7%) and Gaucher disease (n=30 videos, 16.3%), no significant differences were observed (P=.43 and P=.07, respectively). Chi-square analysis indicated no association between creator type and inclusion of peer-reviewed references (χ25=10.6; P=.07). Overall, only 7 (3.8%) videos cited scientific literature.
Conclusions: TikTok serves as a key platform for rare disease awareness and community engagement, although the quality and accuracy of health information vary widely. Although medical professionals produced higher-quality content, it tended to receive less visibility. Increasing the presence of health care professionals and improving visibility of evidence-based content could enhance patient education and safer health information sharing.

Keywords:  Ehlers-Danlos syndrome; Gaucher disease; Marfan syndrome; TikTok; cystic fibrosis; health misinformation; patient education; rare diseases; social media

DOI:  https://doi.org/10.2196/79978
J Paediatr Child Health. 2026 Feb 23.

Cracking the Code: Adolescent Insights on TikTok Health Information Videos Produced by University Health Students.

Stephanie Brown, Sarah Bush, Amy Gray, Carolyn Van Heerden, Tomas Arvanitis, Loredana Marchione, Wonie Uahwatanasakul.

   AIM: Use of social media platforms such as TikTok within the adolescent population is widespread. Harnessing its accessibility and prevalence provides health professionals an opportunity to disseminate positive, evidence-based health information. However, infiltrating this domain brings challenges such as countering abundant misinformation and understanding the target audience. Creating successful short-form videos for social media is a nuanced skill. University health students produced videos focusing on common adolescent issues for a health-promoting TikTok channel. The aim of this study was to explore secondary school student perceptions of these videos.
METHODS: A mixed methods evaluation was undertaken using surveys and focus group interviews. Teachers from participating secondary schools recruited parents and students via the school online communication system. Descriptive statistics from survey responses were used to analyse demographics and scale responses. A uses and gratification lens was used for inductive content analysis of qualitative data.
RESULTS: Participant students were predominantly from Year 9 (age 14-15 years), 161/212 (76%). The median score for enjoyment and positive learnings from videos was 5/10 and for likelihood of sharing videos was 3/10. Three themes emerged from the qualitative data: (1) mixed perceptions of video content with a preference for health not disease, (2) engagement driven by entertainment and 3) social media platforms for enjoyment versus education.
CONCLUSIONS: Social media platforms are an accessible source of health information for adolescents. Health professionals have an opportunity to provide evidence-based health information and combat misinformation. Creating effective and targeted video content can increase the positive impact on adolescent audiences.

Keywords:  adolescent; communication; consumer health information; engagement; social media

DOI:  https://doi.org/10.1111/jpc.70334
Digit Health. 2026 Jan-Dec;12:12 20552076261425515

Assessment of the most popular short videos about hemorrhoids on Chinese TikTok: A cross-sectional study.

Mingqiu Lu, Danna Shen, Liye Zhuang, Jiazi Yu.

   Background: Social media like TikTok serve as major channels for sharing health information on hemorrhoids. However, little is known about the quality, reliability, and impact of this content on user decisions.
Objective: Our study aims to evaluate the quality and reliability of hemorrhoid-related videos on Chinese TikTok.
Methods: On August 7, 2025, we systematically searched Chinese TikTok using relevant keywords and identified the 192 most-liked hemorrhoid-related videos. Two independent reviewers assessed each video using the Global Quality Score (GQS), the modified DISCERN (mDISCERN) score, and the Journal of the American Medical Association (JAMA) criteria. User engagement metrics (total likes, comments, saves, and shares), video length, and upload date were recorded. Spearman correlation analysis was performed to explore the association between quality scores and engagement metrics.
Results: Overall, 87 videos were included and accumulated 6,599,153 likes. Of these, 25.29% (n = 22) were classified as low quality, 40.23% (n = 35) as moderate quality, and 34.48% (n = 30) as high quality. Most videos (74.71%, n = 65) were uploaded by certified physicians, who contributed the majority of high-quality content (93.34%), whereas low-quality content mainly originated from general users (54.54%). Videos uploaded by certified physicians demonstrated significantly higher quality and reliability than those from non-physicians (p < 0.0001). Significant differences were observed across quality groups in video length and daily averages of likes, comments, and shares (p < 0.001, p = 0.031, p < 0.001, and p = 0.022, respectively). High-quality videos were the longest, while low-quality videos received the highest engagement. Moreover, video length showed a significantly positive correlation with both the GQS (r = 0.56, p < 0.001) and the mDISCERN score (r = 0.37, p < 0.001).
Conclusion: The quality and reliability of hemorrhoid-related videos on Chinese TikTok are moderate. To improve health information quality, TikTok could consider extending video length as appropriate, strengthening content moderation, and encouraging creators to reference authoritative sources.

Keywords:  Hemorrhoids; TikTok; health information; reliability; short videos; social media; video quality

DOI:  https://doi.org/10.1177/20552076261425515
Wideochir Inne Tech Maloinwazyjne. 2025 Dec 29. 20(4): 456-463

Assessment of the educational value of robotic‑assisted thoracoscopic pulmonary lobectomy videos.

Nilay Çavuşoğlu Yalçın, Ayşegül Güler, Muharrem Özkaya.

   INTRODUCTION: Robotic-assisted thoracic surgery (RATS) is increasingly used in lung cancer treatment. As surgical education progressively shifts to online platforms, such as YouTube, concerns have emerged regarding the reliability of available content. This study evaluated the educational quality of the most-viewed RATS lobectomy videos on YouTube, using the Laparoscopic Surgery Video Educational Guidelines (LAP-VEGaS) and the Critical View of Safety (CVS) criteria.
AIM: We aimed to evaluate the educational quality of widely viewed YouTube videos on RATS using the LAP-VEGaS assessment and CVS criteria tools.
MATERIALS AND METHODS: A YouTube search was performed using a key word "robotic lobectomy." A total of 25 videos with more than 5000 views that met the inclusion criteria were evaluated in terms of video characteristics and educational quality. The assessment was performed using the CVS and LAP-VEGaS criteria, and statistical analyses were conducted to explore correlations between video features and the scores.
RESULTS: A total of 25 videos met the inclusion criteria. Right upper lobectomy was the most frequently demonstrated procedure. Median view count was 7157 (6001-14 152), with significant correlations between views and likes, as well as duration online. Overall educational quality was limited, with median CVS compliance of 50% (50%-62.5%) and a median LAP-VEGaS score of (4-10.5).
CONCLUSIONS: The educational quality of robotic-assisted lobectomy videos on YouTube is heterogeneous and generally suboptimal. Peer-reviewed and standardized video archives curated by academic institutions or professional societies are needed to ensure reliable resources for robotic surgery training.

Keywords:  Critical View of Safety; Laparoscopic Surgery Video Educational Guidelines; robotic ‑assisted thoracic surgery; surgical education; video assessment

DOI:  https://doi.org/10.20452/wiitm.2025.17994
PLoS One. 2026 ;21(2): e0342690

Health information-seeking behavior among women with polycystic ovary syndrome: A scoping review protocol.

Xia Chen, Xiaoping Xie, E Tan, Yingni Liang, Fen Liu, Zhongyu Li, Yinhua Su.

BACKGROUND: Polycystic ovary syndrome (PCOS) is the most common endocrine metabolic disorder among women of childbearing age, and self-management of PCOS patients relies on their ability to obtain health information. The proliferation of digital technologies, particularly social media and health applications, has fundamentally transformed health information-seeking behaviors (HISB) in this population. However, the present information behavior patterns of PCOS patients have not yet been systematically integrated.
OBJECTIVES: This scoping review aims to systematically map the landscape of HISB in women with PCOS by utilizing Wilson's model of information-seeking behavior as theoretical framework. It seeks to synthesize evidence on their information needs, preferred channels, behavioral types, and key influencing factors.
METHODS: The scoping review will adhere to Arksey and O'Malley's methodological framework and report following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. The review will include English-language literature published from inception up to November 30, 2025, searched through PubMed, Web of Science, Embase (Ovid), CINAHL, Cochrane Library, and APA PsycINFO. To find more relevant studies, we will also search grey literature, the reference lists of the included studies, and related systematic reviews. Two researchers will independently screen titles/abstracts, followed by full-text articles, to assess whether articles meet the inclusion criteria. A third researcher will resolve any discrepancies. Data extraction and narrative synthesis will be structured around the core constructs of Wilson's model, providing a theory-informed analysis of the evidence.
ETHICS AND DISSEMINATION: Since this review involves collecting data from existing literature and does not involve human participants, ethical approval is not required. This scoping review will be submitted for publication to a peer reviewed academic journal.

DOI: https://doi.org/10.1371/journal.pone.0342690
Future Oncol. 2026 Feb 27. 1-7

Helping people navigate and make sense of cancer information: challenges and perspectives.

Lisa Rezende, Gissoo DeCotiis, Eric K Singhi.

  People living with cancer, and their caregivers and families, have access to many sources of health information. This brings both positives and challenges. Some information is not easy to understand and it can be difficult to know what is relevant to an individual's situation. It can also be difficult to know what sources are trustworthy, how to check accuracy, and what might be false information (misinformation or disinformation). This podcast brings together the perspectives of a medical oncologist, a patient advocate, and a recipient of risk-mitigating cancer care who also brings experience of caregiving and volunteering for an organization that provides information to support those at risk of developing cancer. They discuss various approaches to help people find relevant and trustworthy health information, and consider how healthcare providers can support this. An accompanying infographic provides a patient's guide to cancer-focused medical journals, which can be shared with people with cancer and their caregivers to help open the door to the patient-friendly content that is sometimes published by these journals. Overall, people are encouraged to seek support from their healthcare providers to find and interpret cancer information, with consideration of relevance to their own situation and the validity of the source.

Keywords:  Cancer; health information; health literacy; misinformation; patients; social media

DOI:  https://doi.org/10.1080/14796694.2026.2628969
AMIA Annu Symp Proc. 2024 ;2024 376-382

When Patients Go to "Dr. Google" Before They Go to the Emergency Department.

Michael A Grasso, Alexandra Rogalski, Naveed Farrukh, Anantaa Kotal, Enrique Calleros.

Approximately one-third of adults search the internet for health information before visiting an emergency department (ED), with 75% encountering inaccurate content. This study examineshow such searches influence patient care. We conducted an observational study of ED visits overa 12-month period, surveying 214 of 576 patients about pre-ED internet use. Data on demographics, comorbidities, acuity, orders, prescriptions, and dispositions were extracted. Patients who searched were typically younger, healthier, and more educated. Most used a general search engine to ask symptom-related questions. Compared to non-searchers, they were less likely to receive lab tests (RR 0.78, p=0.053), imaging(RR 0.75, p=0.094), medications (RR 0.67, p=0.038), oradmission (RR 0.68, p=0.175). They were more likely to leave against medical advice (RR 1.67, p=0.067) and receive opioids (RR 1.56, p=0.151). Findings suggest inaccurate health information may contribute to mismatched expectations and alter ed care delivery.
Digit Health. 2026 Jan-Dec;12:12 20552076261427777

Digital access and health outcomes: The moderating role of socioeconomic status in health information seeking.

Yingying Cai.

   Objective: Digital health disparities represent a growing equity concern in an era of increasing reliance on digital platforms for health information, yet unequal access and utilization patterns may disadvantage vulnerable populations. This study investigates how socioeconomic inequalities shape the relationship between digital access and health outcomes, specifically examining the mediating role of health information seeking and the moderating effect of socioeconomic status (SES) on the pathway from digital access to health information seeking.
Methods: This study analyzed cross-sectional data from the 2021 China General Social Survey (N = 2265 after applying inclusion criteria). A composite digital access index and a health information seeking index were constructed, with self-rated health as the primary outcome measure. Hierarchical multiple regression examined the association between digital access and health outcomes, controlling for SES, gender, age, and residence. Bootstrap mediation analysis (5000 replications) tested the mediating role of health information seeking, and moderation analysis examined how SES influenced the pathway from digital access to health information seeking. All analyses were performed using Stata 18.
Results: Digital access significantly predicted better health outcomes. However, health information seeking did not mediate this relationship when SES was controlled. Importantly, SES emerged as a moderator of the association between digital access and health information seeking, revealing that digital access yielded greater information-seeking benefits for higher SES individuals.
Conclusion: Digital access improves health outcomes, but not through health information seeking as commonly assumed. SES moderates the link between digital access and health information seeking, with benefits concentrated among advantaged groups, potentially exacerbating health inequalities. Effective digital health policies need address not only technology access but also the socioeconomic barriers that prevent disadvantaged populations from translating digital access into health information seeking and improved health outcomes.

Keywords:  Digital access; digital health disparities; health information seeking; health outcomes; socioeconomic status

DOI:  https://doi.org/10.1177/20552076261427777
Contemp Nurse. 2026 Feb 25. 1-10

From scrubs to screens: A hashtag analysis study of #Nursetok on TikTok.

Firdous M Usman, Meisya Rosamystica, Zara Arshad, Faisal A Nawaz, Rahul Kashyap.

  Background: TikTok has become a popular platform for healthcare communication, with #Nursetok enables nurses to share experiences, education, and professional insights. Research on the content, engagement, and educational value of these videos remains limited.Objective: To analyze #Nursetok video content, creator demographics, engagement metrics, and information quality, assessing the role of TikTok in nursing practice and education.Methods: On April 30th, 2024, 100 English-language nursing-related TikTok videos under #Nursetok were selected. Metadata included publisher type, gender, views, likes, comments, shares, and saves. Videos were evaluated for understandability and actionability using PEMAT-A/V and quality using DISCERN. Each video was independently reviewed by two healthcare professionals, with a third resolving discrepancies.Results: Most videos (96%) were created by registered nurses; 68% by females. Content was primarily entertainment (45%) and healthcare professional perspectives (29%), with educational content comprising 10%. Average video length was 0.58 min, with mean views of 5.1 million, 533,803 likes, 2,632 comments, 21,707 shares, and 26,715 saves. PEMAT-A/V scores indicated moderate understandability (57%) but low actionability (13%). DISCERN ratings averaged 23.6%, reflecting moderate to poor quality.Conclusions: #Nursetok is predominantly nurse-driven and entertainment-focused, with limited actionable or educational content. Future directions include diversifying contributors, incorporating interprofessional perspectives and developing evidence-based, simulation or case-based content to enhance its value for professional development in nursing.

Keywords:  TikTok; healthcare communication; nursing education; social media; video content analysis

DOI:  https://doi.org/10.1080/10376178.2026.2635714
Rev Esc Enferm USP. 2026 ;pii: S0080-62342026000100414. [Epub ahead of print]60 e20250253

Online health information seeking behaviors and self-reported health among Chinese university students: associations and sex differences.

Jie Chen, Hua Tian.

OBJECTIVES: Aim to explore the associations and sex differences between online health information seeking behaviours (OHISBs) and self-reported health among university students.
METHODS: Cross-sectional, analytical study carried out in the 5,408 university students from Xinyang Normal University, who responded questionnaire in Wenjunxing as valid participants. t-tests and binary logistic regression analysis were performed using SPSS.
RESULTS: Sex differences in OHISBs were found. Females most liked food nutrition and diet-related OHI (p < 0.001), while males most liked physical exercise OHI (p < 0.001). For specific OHISBs, finding information about hospitals or doctors (OR = 1.785, [CI95% = 1.212-2.628], p < 0.01), and online reservation of health care projects (OR = 2.491, [CI95% = 1.056-5.876], p < 0.05) had significant impacts on females' self-reported health, and finding information about hospitals or doctors (OR = 2.171, [CI95% = 1.035-4.551], p < 0.05) had a significant impact on males' self-reported health.
CONCLUSION: Findings of sex-differentiated online health seeking and the benefits of proactive search indicate that effective digital interventions for students must combine gender-specific strategies with behavioral empowerment.

DOI: https://doi.org/10.1590/1980-220X-REEUSP-2025-0253en