bims-librar Biomed News
on Biomedical librarianship
Issue of 2026-02-01
35 papers selected by
Thomas Krichel, Open Library Society



  1. Med Ref Serv Q. 2026 Jan 29. 1-8
      PsycInfo is a foundational database for evidence-based behavioral health literature, which makes it an essential tool for health sciences librarians. This column provides a comprehensive comparison of PsycInfo on Ovid and EBSCO platforms. It offers a foundational overview for new librarians, guiding them through essential functionalities and navigation. The article also helps librarians evaluate potential database acquisitions by highlighting the distinct features and search capabilities of each interface. This comparison aims to equip health sciences librarians with the knowledge to optimize search strategies and inform strategic purchasing decisions.
    Keywords:  Database comparison; EBSCO; MEDLINE; Ovid; PsycInfo; evidence-based practice; information retrieval; search strategies; user experience
    DOI:  https://doi.org/10.1080/02763869.2026.2619781
  2. Front Psychol. 2025;16:1729654
      [This corrects the article DOI: 10.3389/fpsyg.2025.1642381.].
    Keywords:  PIR sensors; academic library; explainable AI; machine learning; occupancy monitoring; seat preference; spatial behavior; user comfort
    DOI:  https://doi.org/10.3389/fpsyg.2025.1729654
  3. Anesth Analg. 2026 Jan 29.
       BACKGROUND: Advances in artificial intelligence (AI) have enabled large language models (LLMs) to generate complex and contextually relevant medical responses. However, their potential in clinical decision support for anesthesiology remains underexplored. This study evaluated the accuracy and clinical relevance of high-performing LLMs in response to anesthesia-related questions and compared their performance with traditional online search methods. Clinician perceptions of AI were also assessed. We hypothesized that top-performing large language models would outperform lower-tier models and traditional internet search tools by generating responses rated as more accurate, complete, and clinically relevant to anesthesiology-focused questions, as measured by higher mean evaluator scores on a 10-point Likert scale.
     METHODS: Ten LLMs were evaluated: GPT-4o, Claude-Sonnet 3.5, DeepSeek R1, Llama 3.1 Instruct 70B, Gemini 2.0, GPT o1-preview, GPT o1, GPT o3-mini, NOVA Pro, and Mistral. All models were tested using ten common general anesthesia questions developed by TMH and validated by 6 physicians. Two Google search conditions served as baselines: a default search conducted in a cleared browser (unpersonalized), and a personalized Google Snippet Search performed in a browser regularly used by a clinician. Four board-certified anesthesiologists independently rated each response on a 10-point Likert scale. An ad hoc Physician Perception Questionnaire captured clinicians' use of AI, trust in its output, and reliance on traditional information sources.
    RESULTS: LLM performance varied significantly (F = 5.89, P <.0001). DeepSeek R1 achieved the highest overall score (7.7), whereas Gemini 2.0 Flash recorded the lowest among LLMs (5.2). The Google Snippet Search scored 5.3, the lowest overall. Pairwise Welch's t tests showed that DeepSeek R1 significantly outperformed Llama, o3-mini, and Mistral (P <.001). Survey results indicated limited AI use in clinical practice; clinicians prioritized source credibility and continued to favor traditional resources.
    CONCLUSIONS: Although LLM-generated responses differed in quality, DeepSeek R1 and Claude-Sonnet 3.5 produced answers most consistent with expert clinical judgment. The poor performance of several models, coupled with clinician skepticism, underscores the need for further validation before integrating AI into routine anesthesiology decision support.
    DOI:  https://doi.org/10.1213/ANE.0000000000007864
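     Entry 3's model comparison rests on a one-way ANOVA followed by pairwise Welch's t tests. A minimal sketch in Python with scipy, using invented Likert ratings for three of the named models; the study's actual scores and analysis details are not reproduced here.

        # Compare per-model evaluator ratings with one-way ANOVA and pairwise Welch's t tests.
        from itertools import combinations
        from scipy import stats

        # Hypothetical 10-point Likert ratings, one averaged score per question per model.
        ratings = {
            "DeepSeek R1": [8, 7, 9, 7, 8, 8, 7, 9, 8, 7],
            "GPT-4o":      [7, 6, 8, 7, 7, 6, 8, 7, 6, 7],
            "Mistral":     [5, 6, 5, 4, 6, 5, 5, 6, 4, 5],
        }

        # Omnibus test across models (the abstract reports F = 5.89, P < .0001 for ten models).
        f_stat, p_val = stats.f_oneway(*ratings.values())
        print(f"ANOVA: F = {f_stat:.2f}, p = {p_val:.4f}")

        # Pairwise Welch's t tests (unequal variances), as used for the model-vs-model comparisons.
        for a, b in combinations(ratings, 2):
            t, p = stats.ttest_ind(ratings[a], ratings[b], equal_var=False)
            print(f"{a} vs {b}: t = {t:.2f}, p = {p:.4f}")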
  4. Healthcare (Basel). 2026 Jan 06;14(2):140. [Epub ahead of print]
      Background/Objectives: Caregivers of infants with congenital muscular torticollis (CMT) frequently seek information online, although the accuracy, clarity, and safety of web-based content remain variable. As large language models (LLMs) are increasingly used as health information tools, their reliability for caregiver education requires systematic evaluation. This study aimed to assess the reproducibility and quality of ChatGPT-5.1 responses to caregiver-centered questions regarding CMT. Methods: A set of 17 questions was developed through a Delphi process involving clinicians and caregivers to ensure relevance and comprehensiveness. ChatGPT generated responses in two independent sessions. Reproducibility was assessed using TF-IDF cosine similarity and embedding-based semantic similarity. Ten clinical experts evaluated each response for accuracy, readability, safety, and overall quality using a 4-point Likert scale. Results: ChatGPT demonstrated moderate lexical consistency (mean TF-IDF similarity 0.75) and high semantic stability (mean embedding similarity 0.92). Expert ratings indicated moderate to good performance across domains, with mean scores of 3.0 for accuracy, 3.6 for readability, 3.1 for safety, and 3.1 for overall quality. However, several responses exhibited deficiencies, particularly due to omission of key cautions, oversimplification, or insufficient clinical detail. Conclusions: While ChatGPT provides fluent and generally accurate information about CMT, the observed variability across topics underscores the importance of human oversight and content refinement prior to integration into caregiver-facing educational materials.
    Keywords:  ChatGPT; caregiver education; congenital muscular torticollis; health information quality; large language models; pediatric rehabilitation; reproducibility
    DOI:  https://doi.org/10.3390/healthcare14020140
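     Entry 4 measures lexical reproducibility with TF-IDF cosine similarity between answers generated in two separate sessions. A minimal sketch assuming scikit-learn is available; the two strings are made-up stand-ins for the chatbot's answers to the same question.

        # Lexical reproducibility of two chatbot answers via TF-IDF cosine similarity.
        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.metrics.pairwise import cosine_similarity

        session_1 = "Congenital muscular torticollis is usually treated with gentle stretching exercises."
        session_2 = "Gentle stretching exercises are the usual first treatment for congenital muscular torticollis."

        tfidf = TfidfVectorizer().fit_transform([session_1, session_2])
        score = cosine_similarity(tfidf[0], tfidf[1])[0, 0]
        print(f"TF-IDF cosine similarity: {score:.2f}")  # 1.0 = identical wording, 0.0 = no shared terms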
  5. BMC Oral Health. 2026 Jan 26.
      
    Keywords:  clinical decision support; digital health tools; health information quality; readability analysis
    DOI:  https://doi.org/10.1186/s12903-026-07751-7
  6. HSS J. 2026 Jan 21. 15563316251408833
       Background: Avascular necrosis (AVN) of the bone may result in severe pain, and patients with AVN and their families may seek out information about the condition. With the rise of ChatGPT, AVN patients and families may turn to this chatbot with questions.
    Purpose: We sought to explore expert clinicians' perceptions of the quality of ChatGPT's responses to frequently asked parent questions about AVN in children. Secondary aims of this study were to assess provider perceptions of ChatGPT and AVN parental education and to evaluate the readability of ChatGPT responses.
    Methods: We conducted a cross-sectional survey study of 9 pediatric orthopedic surgeons, oncologists, and advanced practice providers with expertise in the clinical management of AVN. Fifteen common questions parents ask about AVN were posed to ChatGPT, preceded by the following prompt: "Please answer the following parent question relating to avascular necrosis. Please give me a response at or below a sixth-grade reading level: [Question]." The answers were evaluated by participants using a 4-point Likert scale. ChatGPT responses were also assessed using the following readability scores: Flesch-Kincaid Grade Level, Gunning Fog index, and Flesch Reading Ease. In addition, the survey included 4 questions developed to gather overall provider perceptions.
    Results: Providers deemed answers to all 15 questions as at least satisfactory, requiring minimal clarification on average. Yet only 3 ChatGPT responses (20%) were at or below a sixth-grade reading level, as prompted. The average Flesch-Kincaid Grade Level was 6.94, and the average Gunning Fog Index was 9.22, suggesting the responses reflect a reading level between approximately seventh grade and early high school. A majority of providers agreed that these responses would be sufficient for most parents (56%) and that the information was at the appropriate reading level (100%).
    Conclusion: The findings of this small survey study suggest that ChatGPT's responses to common parent questions about AVN were satisfactory, requiring minimal clarification. ChatGPT has the potential to serve as a resource for orthopedic patients and family education, though concerns remain.
    Keywords:  ChatGPT; artificial intelligence; avascular necrosis; osteonecrosis
    DOI:  https://doi.org/10.1177/15563316251408833
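     The readability indices used in entry 6 (and in many of the studies that follow) are closed-form functions of sentence, word, and syllable counts. A rough Python sketch with a deliberately crude vowel-group syllable counter; dedicated readability tools count syllables more carefully, so exact scores will differ.

        import re

        def count_syllables(word: str) -> int:
            # Rough approximation: count groups of consecutive vowels.
            return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

        def readability(text: str) -> tuple[float, float]:
            sentences = max(1, len(re.findall(r"[.!?]+", text)))
            words = re.findall(r"[A-Za-z']+", text)
            syllables = sum(count_syllables(w) for w in words)
            wps = len(words) / sentences          # words per sentence
            spw = syllables / len(words)          # syllables per word
            fre = 206.835 - 1.015 * wps - 84.6 * spw   # Flesch Reading Ease (higher = easier)
            fkgl = 0.39 * wps + 11.8 * spw - 15.59     # Flesch-Kincaid Grade Level
            return fre, fkgl

        fre, fkgl = readability("Avascular necrosis means the bone loses its blood supply. The bone can then weaken and hurt.")
        print(f"FRE = {fre:.1f}, FKGL = {fkgl:.1f}")

     The Gunning Fog index used alongside these combines average sentence length with the share of words of three or more syllables in a similar way.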
  7. Arthrosc Sports Med Rehabil. 2025 Oct;7(5): 101229
       Purpose: To compare the quality of large language model (LLM) responses to frequently asked questions regarding hip arthroscopy, assess the incorrect response rate of LLMs, and compare the readability among different LLM outputs.
    Methods: Three LLMs, including OpenAI Chat Generative Pre-Trained Transformer (ChatGPT) 3.5, Microsoft Co-Pilot, and Google Gemini, were each queried with 10 frequently asked questions regarding hip arthroscopy. Two high-volume hip arthroscopists graded the responses on a 4-point Likert scale (1 = excellent, requiring no clarification; 2 = satisfactory, requiring minimal clarification; 3 = satisfactory, requiring moderate clarification; and 4 = unsatisfactory, requiring substantial clarification). Additionally, the 2 graders ranked the responses from the 3 different LLMs for each of the 10 questions on a 3-point Likert scale (1 = best, 2 = intermediate, 3 = worst). Readability was assessed using the Flesch-Kincaid Grade Level and Flesch Reading Ease metrics.
     Results: Commonly used LLMs performed at a similar level of response accuracy and adequacy (mean ± SD: ChatGPT: 3.0 ± 1.0 vs Microsoft: 2.9 ± 1.1 vs Gemini: 2.6 ± 1.1, P = .481). Reviewers had no preference for one LLM's responses over another (mean ± SD: ChatGPT: 2.0 ± 0.8 vs Microsoft: 2.1 ± 0.9 vs Gemini: 2.0 ± 0.8, P = .931). The overall incorrect response rate among LLMs was 20%. ChatGPT responses were at a significantly worse reading level compared to Gemini and Microsoft outputs (Flesch-Kincaid Grade Level mean ± SD: ChatGPT: 11.0 ± 2.2 grade reading level vs Microsoft: 8.6 ± 2.3 vs Gemini: 6.6 ± 2.2, P = .003; Flesch Reading Ease mean ± SD: ChatGPT: 36.6 ± 19.0 vs Microsoft: 57.7 ± 13.3 vs Gemini: 65.0 ± 4.7, P = .001).
    Conclusions: Hip arthroscopists find LLM outputs on patient questions regarding hip arthroscopy satisfactory but requiring moderate clarification and show no preference for one LLM's responses over another. LLMs produce a substantial number of incorrect responses. ChatGPT outputs had a significantly worse reading level compared to those of Microsoft and Gemini.
    Clinical Relevance: This study provides insights into the accuracy and readability of LLM-generated responses to commonly asked questions about hip arthroscopy. As patients increasingly turn to artificial intelligence tools for health information, understanding the quality and potential risks of misinformation becomes essential.
    DOI:  https://doi.org/10.1016/j.asmr.2025.101229
  8. Ann Plast Surg. 2026 Jan 26.
       BACKGROUND: Patients often use Google as a source of quick medical information, although the accuracy and clarity of search results can vary. ChatGPT has emerged as an alternative tool capable of providing conversational and potentially more reliable medical information. This study compares the readability, accuracy, and completeness of responses generated by ChatGPT with those obtained using Google for common patient questions regarding craniosynostosis and cleft palate.
     METHODS: The terms "Craniosynostosis" and "Cleft Palate" were entered into Google, and the top 10 associated questions for each, identified using the "People Also Ask" tool, were recorded. Each question was then entered into both Google and ChatGPT, and the responses from each were recorded. The ease of readability for each response was determined by the Flesch-Kincaid instrument. Blinded reviewers evaluated accuracy and completeness using a 3-point scale (1 = fully incorrect, 2 = partially incorrect, 3 = correct). Reviewer scores were averaged, and comparisons between platforms were evaluated using t tests.
    RESULTS: A total of 20 questions yielded 40 unique responses. For cleft palate queries, Google responses had significantly lower reading levels than ChatGPT (9.95 vs 13.22, P = 0.006). No significant difference in readability was observed for craniosynostosis responses (14.66 vs 14.73, P = 0.467). Across all questions, ChatGPT responses were significantly more complete (2.60 vs 1.86, P < 0.0001) and more accurate (2.78 vs 2.09, P < 0.0001) than Google responses. These differences persisted when each condition was analyzed separately.
    CONCLUSION: ChatGPT provides more accurate and comprehensive information than Google for common patient questions about craniosynostosis without sacrificing readability. Patients can use this information to inform their future searches in order to obtain the most accurate information about their diagnoses. Further studies evaluating the information learned by patients from both search engines can help clinicians guide patients toward resources that best fit their individual care.
    Keywords:  ChatGPT; artificial intelligence; cleft palate; craniosynostosis; patient information
    DOI:  https://doi.org/10.1097/SAP.0000000000004652
  9. J Orofac Orthop. 2026 Jan 30.
       PURPOSE: This study assessed the accuracy and repeatability of the orthodontics-related information generated by Chat Generative Pre-trained Transformer (ChatGPT, model GPT-4o, 18 July 2024, OpenAI, San Francisco, CA, USA) and evaluated its usefulness for patient education by comparing artificial intelligence (AI)-generated responses to questions that patients frequently search for with responses from two orthodontic specialists.
     MATERIALS AND METHODS: In January and February 2025, 30 descriptive questions (15 on basic orthodontics and 15 on clinically advanced orthodontics) on nondecision, informational content in patient education were assessed, including a "briefly summarize within 500 characters" condition. Accuracy was defined as a response match between ChatGPT and the orthodontist; repeatability was the consistency of ChatGPT output over two iterations. Evaluations used a 5-point Likert scale for accuracy and a 5-point Global Quality Score (GQS) for repeatability. Data were analyzed in R using the Wilcoxon signed-rank test and the Mann-Whitney U test.
    RESULTS: Repeated responses from ChatGPT showed high repeatability with consistent overall accuracy. For basic orthodontics questions, accuracy increased slightly from 4.27 ± 1.03 to 4.53 ± 0.83 (p = 0.203). For clinically advanced orthodontics questions, accuracy remained stable (first: 3.67 ± 1.05, second: 3.73 ± 0.46, p = 0.850). In terms of repeatability and quality assessed by GQS, basic orthodontic questions scored moderately (first: 3.13 ± 0.83, second: 3.27 ± 0.70, p = 0.580), whereas clinically advanced orthodontics questions received higher GQS scores (first: 4.20 ± 0.77, second: 3.80 ± 0.56, p = 0.095), indicating potential applicability in patient education contexts.
    CONCLUSION: The accuracy and repeatability of ChatGPT's responses varied by question type: basic questions were more accurate, while clinically advanced orthodontic questions resulted in higher repeatability and quality. As these findings are limited to patient education and general information delivery, ChatGPT should not be considered a replacement for professional orthodontic expertise.
    Keywords:  Accuracy; Generative artificial intelligence; Global Quality Score (GQS); Patient instructions; Repeatability
    DOI:  https://doi.org/10.1007/s00056-025-00637-3
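     Entry 9 ran its comparisons in R; the equivalent nonparametric tests are available in Python's scipy, sketched here with invented 5-point ratings (a paired Wilcoxon signed-rank test for the two iterations, a Mann-Whitney U test for basic versus clinically advanced questions).

        from scipy import stats

        # Hypothetical 5-point accuracy ratings for the same 15 basic questions, asked twice.
        first_run  = [5, 4, 5, 3, 4, 5, 4, 5, 4, 3, 5, 4, 5, 4, 4]
        second_run = [5, 5, 5, 4, 4, 5, 4, 5, 4, 4, 5, 4, 5, 5, 4]

        # Paired comparison of the two iterations (repeatability).
        w, p_paired = stats.wilcoxon(first_run, second_run)
        print(f"Wilcoxon signed-rank: W = {w}, p = {p_paired:.3f}")

        # Independent comparison of basic vs. clinically advanced question scores.
        advanced = [4, 3, 4, 4, 3, 4, 3, 4, 4, 3, 4, 4, 3, 4, 4]
        u, p_indep = stats.mannwhitneyu(first_run, advanced, alternative="two-sided")
        print(f"Mann-Whitney U = {u}, p = {p_indep:.3f}")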
  10. J Stomatol Oral Maxillofac Surg. 2026 Jan 26. pii: S2468-7855(26)00025-X. [Epub ahead of print] 102732
     BACKGROUND: The expanding use of large language models (LLMs) for rapid, practical information access has accelerated their integration into medicine and dentistry. Yet inaccuracies and fabricated citations raise concerns about reliability in clinical contexts. This study compares ChatGPT 4.0 and OpenEvidence on questions regarding intentional replantation and examines how endodontists and oral and maxillofacial surgeons evaluate the perceived accuracy and completeness of their responses.
    METHODS: Twenty clinically oriented questions addressing both the endodontic and surgical dimensions of intentional replantation were generated using the "alsoasked.com" tool and subsequently reviewed by specialists in endodontics and oral and maxillofacial surgery. Each question was submitted individually to both models. ChatGPT 4.0 was provided with a specialized prompt to emulate expert-level use, whereas OpenEvidence was queried without additional guidance. The resulting 40 responses were evaluated for FKGL (Flesch-Kincaid Grade Level) readability and textual similarity. Perceived accuracy and completeness were analyzed using linear mixed-effects models, with LLM type and evaluator specialty as fixed effects and rater identity as a random effect.
    RESULTS: Both LLMs produced responses with advanced readability, and no significant difference in FKGL scores was observed (p = 0.092). Textual similarity remained low (0-9%), reflecting a high degree of originality across outputs. The LMM analyses demonstrated that LLM type had a significant main effect on both perceived accuracy and completeness (p < 0.001), with OpenEvidence receiving higher scores than ChatGPT-4.0 for both outcomes. The evaluator's clinical specialty did not show a significant independent main effect. However, a significant interaction between LLM type and clinical specialty was observed for perceived accuracy, whereas this interaction did not reach statistical significance for completeness.
    CONCLUSIONS: OpenEvidence's higher perceived accuracy and completeness suggest domain-specific, structured LLMs may better generate high-quality clinical information. This study also provides a rare cross-specialty comparison of evaluations by endodontists and oral-maxillofacial surgeons.
    Keywords:  ChatGPT 4.0; Chatbot; Endodontics; Intentional Replantation; LLM; OpenEvidence; Oral and Maxillofacial Surgery
    DOI:  https://doi.org/10.1016/j.jormas.2026.102732
  11. J Oral Maxillofac Surg. 2026 Jan 09. pii: S0278-2391(26)00032-7. [Epub ahead of print]
       BACKGROUND: Large language models (LLMs) such as Chat Generative Pre-Trained Transformer (ChatGPT; OpenAI, San Francisco, CA) and Claude (Anthropic, San Francisco, CA) are increasingly used by patients seeking information about surgical procedures, including external sinus lifting. However, the accuracy, quality, and readability of these artificial intelligence (AI)-generated explanations remain uncertain.
    PURPOSE: The study purpose was to measure and compare 2 AI language models regarding the reliability, quality, usefulness, and readability of their responses to frequently asked patient questions about external sinus lifting.
    STUDY DESIGN, SETTING, AND SAMPLE: This cross-sectional study assessed computer-generated responses provided by LLMs, referred to as decoder-only-based LLM (DO-LLM) and transformer-based LLM (TB-LLM) to standardized patient questions.
    PREDICTOR VARIABLE: The predictor variable was AI model type (DO-LLM vs TB-LLM).
    MAIN OUTCOME VARIABLES: Outcome measures included reliability, quality, usefulness, and readability. These were assessed using the modified DISCERN instrument, Global Quality Score, a 4-point usefulness scale, and 2 readability indices (Flesch Reading Ease and Flesch-Kincaid Grade Level). Seventy-two standardized questions across 10 clinical domains were submitted to both models. Responses were independently evaluated by one oral and maxillofacial surgeon and 2 periodontists, followed by consensus scoring.
    COVARIATES: Not applicable.
    ANALYSES: Descriptive statistics summarized outcomes. Depending on normality, comparisons used the independent samples t-test or Mann-Whitney U test. Associations between categorical variables were analyzed using Pearson's χ2 or Fisher's Exact test.
    RESULTS: For modified DISCERN, DO-LLM scored 21.88 (3.09), 22.14 (2.04), and 22.63 (2.56) in preoperative preparation, graft materials, and risks/complications, whereas TB-LLM scored 13.88 (4.52), 17.29 (4.27), and 19 (1.77), respectively (P < .05). For Global Quality Score in lifestyle and behavioral recommendations, TB-LLM scored 4 (0) compared with 3.29 (0.49) for DO-LLM (P < .05). Moderate-quality responses were more common with DO-LLM (56.9%), while TB-LLM produced a higher proportion of good quality scores (29.2%) (P < .05).
    CONCLUSION AND RELEVANCE: Both AI models demonstrated potential value for patient education on external sinus lifting, though their strengths differed by content domain. DO-LLM provided stronger procedural and risk-related explanations, whereas TB-LLM offered more comprehensive lifestyle-related guidance. Continued refinement of dental-specific AI tools and integration of patient-centered considerations remain essential.
    DOI:  https://doi.org/10.1016/j.joms.2026.01.001
  12. Aesthetic Plast Surg. 2026 Jan 29.
       BACKGROUND: Health literacy is an understudied topic among published Spanish resources. Research focused on resources written in English demonstrates that content exceeds the recommended reading level for patients. Given the prevalence of breast reduction procedures and the increase in diverse patients undergoing the procedure in the last few years, this study explores the readability of Spanish information discussing breast reduction surgery by private and academic organizational websites.
    METHODS: Using a de-identified Google search engine, we identified the first 20 Spanish websites that provided breast reduction information. Two independent reviewers used the Patient Education and Materials Assessment Tool (PEMAT) and Cultural Sensitivity Assessment Tool (CSAT) to assess understandability, actionability, and cultural sensitivity of each website. The Spanish SMOG readability formula (SOL), Gilliam-Peña-Mountain, the Fry Readability Adaptation for Spanish Evaluation (FRASE), and Crawford assessments were used to assess readability.
     RESULTS: Both private- and academic-based websites scored above 70% for understandability but had lower scores for actionability. CSAT scores were just marginally above the threshold. Both private and academic readability assessments revealed consistently high reading grade levels, ranging from ninth to eleventh grade, except Crawford scores, which assessed a mean reading level of 6th grade.
    CONCLUSION: Websites displaying Spanish content exceed the recommended level for patient educational materials. While the average understandability scores may be satisfactory on some websites, many have room for improvement, specifically regarding actionability. A limited sample size also emphasized the need to advocate for institutions to cater to patients who speak languages other than English. Important Points: Online Spanish resources on breast reduction are often too complex and exceed the recommended reading level for patients. Spanish resources need to be more than simple English translation to promote more cultural sensitivity. There is a significant gap in published online Spanish resources on breast reduction, particularly from academic and private organizations.
    LEVEL OF EVIDENCE V: This journal requires that authors assign a level of evidence to each article. For a full description of these Evidence-Based Medicine ratings, please refer to the Table of Contents or the online Instructions to Authors www.springer.com/00266 .
    Keywords:  Breast reduction; Cultural sensitivity; Health literacy; Online resources; Spanish
    DOI:  https://doi.org/10.1007/s00266-026-05628-2
  13. Cureus. 2025 Dec;17(12): e99969
      Introduction: Artificial intelligence (AI) chatbots are increasingly being used to create patient education guides (PEGs). However, there are gaps in the literature comparing the latest versions in terms of readability, reliability, and similarity. The aim of this study was to compare PEGs generated by ChatGPT 5.1 (OpenAI, San Francisco, California, US) and Gemini 3 Pro (Google LLC, Mountain View, CA, USA) for five common urological conditions (kidney stone, urinary tract infection, urinary retention, erectile dysfunction, and benign prostatic hyperplasia) across these domains.
     Methods: This cross-sectional study analysed PEGs generated by both AI chatbots for five common urological conditions using identical prompts. Readability was assessed using the Flesch Reading Ease Score and Flesch-Kincaid Grade Level. Reliability and similarity were assessed using a modified DISCERN score and Turnitin, respectively. Statistical comparison was performed using the Mann-Whitney U test.
     Results: None of the evaluated characteristics showed a statistically significant difference between the PEGs generated by the two AI chatbots.
     Conclusion: PEGs generated by both AI chatbots exceeded the recommended reading level, demonstrated limited originality, and showed moderate reliability, highlighting the need for professional oversight. Continued refinement of AI chatbots is necessary before integrating AI-generated PEGs into routine patient education.
    Keywords:  artificial intelligence; chatgpt; gemini; patient education; urology
    DOI:  https://doi.org/10.7759/cureus.99969
  14. Clin Pract. 2025 Dec 25;16(1):2. [Epub ahead of print]
     BACKGROUND: Patient-specific instrumentation (PSI) in total knee arthroplasty (TKA) represents an increasingly relevant component of personalized surgical planning. As nearly half of orthopedic patients search online for medical information before or after clinical consultation, the quality, accuracy, and readability of publicly available digital resources directly influence patient expectations, shared decision-making, and rehabilitation engagement. This study assessed the content, quality, and readability of online information about PSI in TKA.
    METHODS: Google searches using four predefined PSI-related terms were conducted on 6 March 2025. After applying exclusion criteria, 71 websites were included for evaluation. Websites were categorized as academic or non-academic and analyzed for authorship, reporting of advantages and disadvantages, inaccurate assertions, use of peer-reviewed references, multimedia content, and mention of specific PSI platforms. Website quality was assessed using validated quality evaluation tools (QUEST and JAMA criteria), and readability was evaluated using established readability indices (SMOG, FKGL, and FRE).
    RESULTS: Academic websites demonstrated significantly higher quality than non-academic sources based on QUEST (25.4 vs. 9.8; p < 0.001) and JAMA criteria (3.7 vs. 1.7; p < 0.001). Disadvantages of PSI were reported in 69.1% of academic sites versus 12.5% of non-academic sites (p < 0.001). Inaccurate claims occurred in 31.3% of non-academic sites but were absent in academic sources (p < 0.001). Peer-reviewed references were present in 81.8% of academic websites and only 12.5% of non-academic sites (p < 0.001). Readability was uniformly poor across all websites, with no significant group differences (mean SMOG 13.5; mean FKGL 11.8; mean FRE 32.4).
    CONCLUSIONS: Online information about PSI in total knee arthroplasty varies widely in transparency and accuracy, with non-academic websites frequently omitting risks or presenting misleading claims. Given the role of individualized implant planning, accessible and evidence-based digital content is essential to support personalized patient education and shared decision-making. Because limited readability restricts patient comprehension and informed participation in personalized orthopedic care, improving the clarity and accessibility of digital patient resources is essential.
    Keywords:  health literacy; knee arthroplasty; online health information; patient education; patient-specific instrumentation; shared decision-making
    DOI:  https://doi.org/10.3390/clinpract16010002
  15. Clin Pediatr (Phila). 2026 Jan 30. 99228251415210
      Health literacy and language impact comprehension of and adherence to written educational materials, including those for gastrostomy tubes (g-tubes). Our objective was to evaluate the readability, understandability, actionability, content, and language availability of a national sample of written g-tube educational materials. We conducted a cross-sectional study of g-tube educational materials from the top 20 children's hospitals (US News and World Report) obtained via a systematic online search and provided by the institutions. We assessed material: (1) readability, (2) understandability and actionability (Patient Education Materials Assessment Tool for Printable Materials), (3) content, and (4) language availability. Mean (standard deviation [SD]) reading grade level was 8.3 (1.9). Mean (SD) understandability and actionability scores were 81.6% (12.1%) and 65.9% (23.2%), respectively. Materials covered a mean (SD) of 46.1% (25.3%) of content items; 20% of institutions provided materials in non-English languages. Future research should examine how to improve educational materials for children with g-tubes.
    Keywords:  children with medical complexity; gastrostomy tubes; health literacy; patient education; pediatric hospital medicine
    DOI:  https://doi.org/10.1177/00099228251415210
  16. Cureus. 2025 Dec;17(12): e99993
       OBJECTIVE: This study aimed to evaluate the readability, quality, and reliability of online patient education materials (PEM) related to sarcopenia.
    MATERIALS AND METHODS:  In August 2024, 278 websites obtained by searching the term 'sarcopenia' on Google were evaluated. After applying the exclusion criteria, 66 websites were included in the study. The ranks of the websites were evaluated with Blexb (New York, USA), readability evaluations were done with calculators, and quality and reliability were evaluated per the Journal of the American Medical Association (JAMA), Global Quality Scale (GQS), and modified Decision Index for Systematic Consumer Evaluation of Reviewed Narratives (DISCERN) criteria.
    RESULTS:  The readability values of the contents were compared with the 6th-grade level (GRL), and all results were significantly higher than the 6th GRL (p<0.01). The median values in the readability assessment of the 66 websites included in the study were as follows: Flesch Reading Ease Score (FRES) 40.81 (6.10-72.84), Gunning Fog (GFOG) 14.78 (9.10-22.20), Flesch-Kincaid Grade Level (FKGL) 12.18 (7.16-18.58), Coleman-Liau (CL) score 12.89 (7.14-18.45), Simple Measure of Gobbledygook (SMOG) 12.34 (7.77-17.65), Automated Readability Index (ARI) 12.49 (7.70-19.64), Linsear Write (LW) 13.30 (6.89-25.05), and GRL 15.25 (8.00-23.00). The types of websites were compared in terms of readability levels, and no significant difference was found. When the website typologies were compared, no significant difference was observed in GQS, JAMA, and DISCERN. A low-level negative correlation was found between Blexb rank and JAMA. The correlations between Blexb rank and readability scales were as follows: weak negative correlation with FRES and ARI scores and weak positive correlation with GFOG, FKGL, CL, and SMOG scores (p<0.05).
    CONCLUSION: The best category in terms of readability, reliability, and quality was determined to be health portals, but none of the websites had readability at or below the recommended 6th GRL.
    Keywords:  patient education materials; quality; readability; reliability; sarcopenia
    DOI:  https://doi.org/10.7759/cureus.99993
  17. J Hum Nutr Diet. 2026 Feb;39(1): e70207
       AIMS: To quantify the content, understandability, actionability, readability, and overall quality of food chemical intolerance dietary information for patients available online and via mobile applications.
     METHODS: Content analysis was undertaken between August 2023 and August 2024 of eligible webpages and mobile applications. Online material was evaluated for quality using the DISCERN tool, clarity using the Centers for Disease Control and Prevention's Clear Communication Index, and health literacy demand using the Patient Education Material Assessment Tool and the Hemingway readability calculator. Mobile applications were evaluated for quality using the Mobile App Rating Scale.
     RESULTS: A total of 169 websites and four mobile applications were eligible for analysis. Almost all (95%) resources recommended an elimination diet for the management of food chemical intolerance, but only 56% advised food reintroduction. Overall, diet information regarding food chemical intolerance found online and in mobile applications was mostly of poor quality. Information found online was also of low clarity and written with a high health literacy demand. The resources that were rated as good quality were written by health/medical organisations and dietitians.
    CONCLUSION: Future revision and development of online and mobile applications should aim to improve the quality and reduce the health literacy demand of food chemical intolerance diet information. Additionally, online information and applications should include food reintroduction instructions, as prolonged dietary restrictions, especially without supervision, may increase the risk of nutrient deficiencies and disordered eating behaviours, and impact quality of life.
    Keywords:  e‐health; food additive; food hypersensitivity; food intolerance; health literacy; nutrition education
    DOI:  https://doi.org/10.1111/jhn.70207
  18. JMIR Infodemiology. 2026 Jan 28;6:e83900
       Background: Patients increasingly rely on short-video platforms for information regarding in vitro fertilization (IVF), yet the relationship between the scientific quality of this content and its algorithmic dissemination remains unclear.
    Objective: This study aimed to assess the quality, reliability, and key drivers of dissemination of IVF-related short videos on major Chinese social media platforms.
    Methods: A cross-sectional content analysis was conducted on 300 popular IVF-related videos (the top 100 results from each platform) retrieved from Douyin, Bilibili, and Xiaohongshu between January 10 and 15, 2025. Video quality and reliability were evaluated using the Global Quality Score and a modified DISCERN instrument. Predictors of video dissemination were identified using an Extreme Gradient Boosting machine learning model, with the number of "likes" serving as the primary outcome variable.
    Results: Content produced by medical professionals demonstrated significantly higher quality and reliability (median mDISCERN 11.0, IQR 9.0-15.0) compared to non-medical sources (median mDISCERN 8.0, IQR 5.0-13.0; P< .001). However, the Extreme Gradient Boosting analysis identified the uploader's follower count as the most powerful predictor of video "likes." In contrast, quality metrics (Global Quality Score and modified DISCERN scores) had a negligible impact on dissemination.
    Conclusions: In the current Chinese social media landscape, the dissemination of IVF-related videos is strongly associated with creator influence rather than scientific merit. This disconnect between engagement and quality poses a potential risk of misinformation, highlighting the need for medical professionals to adopt platform-native communication strategies to ensure that high-quality information reaches patients.
    Keywords:  content quality; health communication; in vitro fertilization; misinformation; social media
    DOI:  https://doi.org/10.2196/83900
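     Entry 18 predicts video "likes" with an Extreme Gradient Boosting model and then ranks predictors by importance. A minimal sketch with the xgboost Python package on simulated data; the study's actual feature set, tuning, and importance measure may differ.

        # Gradient-boosted regression of video "likes" on video/creator features,
        # then inspect which feature the model leans on most.
        import numpy as np
        import pandas as pd
        from xgboost import XGBRegressor

        rng = np.random.default_rng(0)
        n = 300
        features = pd.DataFrame({
            "follower_count": rng.lognormal(10, 1, n),
            "gqs":            rng.integers(1, 6, n),
            "mdiscern":       rng.integers(0, 16, n),
            "duration_sec":   rng.integers(15, 300, n),
        })
        # Simulated outcome in which creator reach, not quality, drives likes.
        likes = 0.01 * features["follower_count"] + rng.normal(0, 50, n)

        model = XGBRegressor(n_estimators=200, max_depth=3, learning_rate=0.1)
        model.fit(features, likes)

        for name, imp in sorted(zip(features.columns, model.feature_importances_), key=lambda x: -x[1]):
            print(f"{name}: {imp:.2f}")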
  19. Eur J Cardiovasc Nurs. 2026 Jan 29. pii: zvag025. [Epub ahead of print]
       AIM: This study aims to evaluate the quality, reliability, and content of YouTube videos on cholesterol management, analyze their impact on viewers through comment assessment, and contribute to understanding the influence of digital media on health-related decision-making.
     METHODS AND RESULTS: A descriptive analysis was conducted on YouTube videos obtained using four search queries: "high cholesterol treatment," "high cholesterol drugs," "high cholesterol medications," and "high cholesterol pills." A total of 105 videos meeting the inclusion criteria were analyzed. The videos were scored using the Global Quality Scale (GQS), Modified DISCERN, and JAMA benchmark criteria. Sentiment analysis of comments was conducted, and the presence of misinformation, marketing content, and alternative therapy emphasis was evaluated. The analyzed videos amassed 32,565,336 views, with over 50,000 comments. Notably, 18% of the videos rejected cholesterol-lowering medications, and 21% contained misinformation. Videos emphasizing alternative therapies were more likely to include marketing content (70%) and misinformation (43%). Most videos had low to moderate quality and reliability. Independent doctors were the most frequent publishers (55.2%), but their videos exhibited the highest rate of misinformation (29%).
    CONCLUSION: YouTube serves as a widely used yet unreliable source of health information on cholesterol treatments. The prevalence of misinformation, particularly in videos by healthcare professionals, is remarkable. Proactive efforts by credible institutions to disseminate accurate content on this platform are critical.
    Keywords:  YouTube; cholesterol; misinformation
    DOI:  https://doi.org/10.1093/eurjcn/zvag025
  20. Front Digit Health. 2025;7:1612749
       Background: Knee osteoarthritis (KOA) is a chronic joint disorder that significantly affects the quality of life in the older adult. It is primarily characterized by notable knee pain following activity, which typically alleviates with rest. With the rapid growth of the internet, people increasingly rely on social media to obtain health-related information. Short-form videos, as an emerging format, play an important role in information dissemination. TikTok is currently the world's most downloaded application platform primarily dedicated to short-form video content. Against this backdrop, we observed a substantial number of KOA-related videos on TikTok, the quality and reliability of which have not yet been systematically evaluated.
    Objective: To assess the quality and reliability of KOA-related videos available on the domestic TikTok platform.
    Methods: A total of 100 KOA-related videos were retrieved and screened from TikTok. Basic metadata were extracted, and video content and format were categorized through coding. The source of each video was also documented. Two independent raters evaluated video quality using the DISCERN instrument, the Journal of the American Medical Association (JAMA) benchmark criteria, and the Global Quality Score (GQS).
     Results: Of 100 analyzed videos, 96 were posted by medical staff and 4 by science communicators. Eighty videos were audio-based (41% outpatient daily, 39% general science popularization), with the others using graph-text formats. Video content fell into seven groups, including disease prevention, diagnosis, symptoms, disease description, lifestyle, and therapy; videos on disease description were the most common. The average DISCERN, JAMA, and GQS scores were 36.29, 1.24, and 2.45, respectively, and the overall quality was low. Further analysis showed significant differences in video quality between science communicators and medical staff. The numbers of likes, comments, collections, and shares were strongly positively correlated with each other, and weakly positively correlated with the number of days since upload and with DISCERN scores.
    Conclusion: KOA-related content on TikTok demonstrates concerning quality limitations, with significant variation across source types. Given TikTok's expanding influence in health communication, urgent improvements and standardized quality control measures are needed.
    Keywords:  TikTok; health communication; knee osteoarthritis; patient education; social media
    DOI:  https://doi.org/10.3389/fdgth.2025.1612749
  21. Digit Health. 2026 Jan-Dec;12:20552076261415944
       Objective: This study was designed to systematically evaluate the reliability and quality of content related to sleep disorders on four leading Chinese short-form video platforms (TikTok, Bilibili, Kwai, and Xiaohongshu) to inform strategies to improve the dissemination of accurate health information in China.
    Methods: A cross-sectional analysis was conducted in March 2025, in which 400 short videos related to sleep disorders were identified and included. These videos were published between 2022 and 2025 on four major short-form video platforms. Video quality was assessed using the Global Quality Scale (GQS), JAMA benchmark criteria (JAMA), and a modified version of the DISCERN tool. Influencing factors were examined using Spearman correlation analysis and Poisson regression, with all multiple comparisons adjusted using the Bonferroni correction.
    Results: Video popularity metrics and quality scores (GQS, JAMA, modified DISCERN) were significantly higher on TikTok than on other platforms (Adjusted P < .001). The quality of videos produced by physicians (64%) was better than that of laypersons (Adjusted P < .005), with the quality of videos produced by Traditional Chinese Medicine (TCM) practitioners significantly higher than that of Clinical medicine practitioners (Adjusted P < .05). The highest quality scores were found for videos presented as expert commentary (52%). After Bonferroni correction, quality differences among specific "disease knowledge" categories were no longer statistically significant (Adjusted P > .05). Although user interaction metrics showed a statistically significant correlation with quality scores (Adjusted P < .001), the Spearman correlation coefficients were weak to moderate (r = .18-.33), indicating the practical association was limited.
    Conclusions: Information on sleep disorders on Chinese short-video platforms is of low quality. These findings are specific to the Chinese digital ecosystem and may not be generalizable to global platforms. Content posted by medical professionals (especially TCM practitioners) was associated with higher reliability scores, suggesting a potential role for professional oversight in improving the quality of sleep disorder information on short video platforms.
    Keywords:  Sleep disorders; health information; quality and reliability assessment; short videos
    DOI:  https://doi.org/10.1177/20552076261415944
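     Entry 21 combines Spearman correlation, Poisson regression, and Bonferroni adjustment. A compact illustration in Python with scipy and statsmodels on simulated data (not the study's); the variable names and score ranges are assumptions for the sketch.

        # Spearman correlations with Bonferroni adjustment, plus a Poisson regression of
        # an engagement count on a quality score.
        import numpy as np
        import pandas as pd
        import statsmodels.api as sm
        from scipy.stats import spearmanr
        from statsmodels.stats.multitest import multipletests

        rng = np.random.default_rng(1)
        n = 400
        gqs = rng.integers(1, 6, n)
        jama = rng.integers(0, 5, n)
        mdiscern = rng.integers(0, 6, n)
        likes = rng.poisson(lam=np.exp(2 + 0.2 * gqs))

        # Correlate each quality score with likes, then Bonferroni-correct the p values.
        pvals = [spearmanr(score, likes).pvalue for score in (gqs, jama, mdiscern)]
        reject, p_adj, _, _ = multipletests(pvals, method="bonferroni")
        print(dict(zip(["GQS", "JAMA", "mDISCERN"], p_adj)))

        # Poisson regression: does GQS predict the like count?
        X = sm.add_constant(pd.DataFrame({"gqs": gqs}))
        fit = sm.GLM(likes, X, family=sm.families.Poisson()).fit()
        print(fit.summary().tables[1])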
  22. Menopause. 2026 Jan 27.
       OBJECTIVES: This study aims to evaluate the quality and reliability of short videos related to premature ovarian failure on two major Chinese short video platforms, TikTok and Bilibili.
    METHODS: A total of 231 videos related to premature ovarian failure (133 from TikTok and 98 from Bilibili) were analyzed up until March 25, 2025. The video quality was evaluated using the Global Quality Scale (GQS), the modified DISCERN instrument (mDISCERN), and the Journal of the American Medical Association (JAMA) scoring system. Creator categories, content categories, duration, and interaction metrics (likes, comments, shares) were collected and statistically analyzed.
    RESULTS: In the overall correlation analysis, there was a high positive correlation between interaction metrics (r>0.7, P<0.05), whereas no significant correlation was found with video duration. A weak correlation was observed between quality scores and interaction metrics. TikTok was dominated by professional individuals (90.23% were verified users), and the content was primarily disease-related (67.67% was knowledge-based), whereas Bilibili was mainly composed of nonprofessional individuals (76.53%) with more diversified themes (such as lifestyle content accounting for 35.71%). The quality and reliability scores of TikTok videos were significantly higher than those of Bilibili (GQS median: 3.0 vs. 2.0; mDISCERN: 3.0 vs. 2.0; JAMA score: 1.0 vs. 0.0; P<0.001). TikTok videos were significantly shorter in duration than Bilibili videos (P<0.001), and interaction metrics (likes, comments, shares, favorites) were significantly higher.
    CONCLUSIONS: TikTok performs better than Bilibili in terms of the dissemination of information on premature ovarian failure on online video platforms, although the overall quality is not ideal. The quality of videos uploaded by verified medical professionals can be considered relatively reliable. Optimizing platform algorithms to prioritize content from verified creators and standardizing content guidelines are crucial for information seekers to make informed medical decisions and improve public health literacy.
    Keywords:  Popularization of science; Premature ovarian failure; Quality and reliability; Short video.
    DOI:  https://doi.org/10.1097/GME.0000000000002660
  23. Digit Health. 2026 Jan-Dec;12:20552076261417853
       Background: The incidence and mortality of renal cell carcinoma (RCC) have risen significantly in recent years, attracting considerable public attention. Short-video platforms such as TikTok and Bilibili have become important sources of health information, yet the quality and reliability of RCC-related content on these platforms remain unclear.
    Methods: On August 31, 2025, we systematically retrieved the top 110 videos related to RCC from both TikTok and Bilibili using the keyword "RCC." Basic video characteristics were extracted, and two validated instruments-the Global Quality Scale (GQS) and the modified DISCERN (mDISCERN)-were employed to evaluate video quality and reliability, respectively. Spearman correlation analysis was used to examine relationships between engagement metrics and quality scores.
    Results: Of 196 videos included, TikTok content was predominantly from medical professionals (86.0%), while Bilibili had more non-professional uploads (53.13%). TikTok videos demonstrated significantly higher median scores than Bilibili in both GQS (3 [IQR: 2, 4] vs. 2 [IQR: 2, 3], P < .001) and mDISCERN (2 [IQR: 2, 3] vs. 2 [IQR: 1, 2], P < .001). Nevertheless, the median GQS score of 3 indicates only moderate quality, and the median mDISCERN score of 2 reflects a relatively low level of reliability. Videos from healthcare professionals, especially RCC specialists, scored higher in quality (GQS: 3 [IQR: 3, 4]) and reliability (mDISCERN: 3 [IQR: 2, 3]) than non-professional sources (P < .001). Disease knowledge videos scored highest, while advertisements scored lowest. Engagement metrics showed weak negative or non-significant correlations with quality scores.
    Conclusion: The overall quality and reliability of RCC-related short videos on TikTok and Bilibili are suboptimal. Content from medical professionals is more trustworthy, highlighting their essential role in public health education. These findings underscore the need for enhanced content oversight on platforms and critical discernment among viewers when accessing health information online.
    Keywords:  Bilibili; Renal cell carcinoma; TikTok; global quality score; information quality; modified DISCERN; reliability; short videos; social media
    DOI:  https://doi.org/10.1177/20552076261417853
  24. Digit Health. 2026 Jan-Dec;12:20552076261418829
       Objective: This study aimed to evaluate the quality and reliability of information presented in short videos related to anterior cruciate ligament (ACL) injuries on two major Chinese social media platforms, TikTok and Bilibili.
     Methods: A systematic search using the keyword "ACL injuries" was conducted to identify the top 100 Chinese videos on TikTok and Bilibili, respectively. The Global Quality Score (GQS) and the modified DISCERN evaluation scale were employed to assess video content reliability and quality. Video characteristics, including engagement metrics, uploader identity, video length, and content type, were also gathered. Statistical analyses were conducted to examine differences and correlations between platforms, uploader categories, and video quality.
     Results: Out of 200 videos reviewed, 175 met inclusion criteria. The most common content theme was treatment, found in 59 videos (33.71%). TikTok videos attracted higher user engagement than Bilibili videos. However, the overall video quality on both platforms was moderate. TikTok videos scored higher on GQS and modified DISCERN than Bilibili videos. Engagement on TikTok showed no positive correlation with content quality, while engagement on Bilibili demonstrated a moderate positive correlation. Videos uploaded by healthcare professionals were more popular but often tended to be shorter in duration. Notably, videos uploaded by individual users often achieved quality scores comparable to, or even exceeding, those of medical professionals and science communicators.
    Conclusion: TikTok demonstrated higher engagement than Bilibili, but both platforms showed limited quality and reliability in ACL injury-related video content. No strong correlation was observed between video content quality and engagement. These findings highlight the need for improved oversight of ACL injury-related information disseminated through short video platforms in China.
    Keywords:  Anterior cruciate ligament injury; Bilibili; TikTok; quality; reliability; short videos
    DOI:  https://doi.org/10.1177/20552076261418829
  25. Digit Health. 2026 Jan-Dec;12:20552076261415927
       Background: Cervical cancer remains a serious global threat to women's health, with rising incidence and younger demographic impact, challenging reproductive health. Short-video platforms have become key public sources for health information due to digital health communication advances, yet the scientific accuracy and reliability of their cervical cancer content are widely questioned. A systematic evaluation of its quality and dissemination patterns is lacking.
    Objective: This cross-sectional study assessed cervical cancer-related videos on YouTube, TikTok, and Bilibili, examining content breadth, information quality, and dissemination impact.
    Methods: Videos were systematically retrieved in July 2025 using "cervical cancer" keywords across the three platforms. After applying inclusion/exclusion criteria, 201 videos were analyzed. Quality, reliability, and educational value were evaluated using the Global Quality Score (GQS), modified DISCERN, Patient Education Materials Assessment Tool (PEMAT-assessing understandability and actionability), and Journal of the American Medical Association (JAMA) benchmark criteria. Platform differences were compared using the Kruskal-Wallis H test (significance p < 0.05).
    Results: Platform differences emerged: YouTube videos demonstrated the highest quality (GQS mean 3.47 ± 1.06 vs. Bilibili 2.85 ± 0.89, TikTok 3.09 ± 0.75; p = 0.001) and significantly higher PEMAT understandability (76.94 ± 10.43 vs. TikTok 70.14 ± 11.07; p < 0.001). TikTok had the strongest dissemination power. Content coverage was inadequate: only 50.2% mentioned screening, 33.3% covered human papillomavirus vaccination, and a mere 8.0% recommended male vaccination. Creator expertise significantly influenced outcomes: Professionals (doctors/researchers) had higher JAMA authority scores and PEMAT actionability. Patient-created videos generated the highest interaction but scored lowest on quality metrics.
    Conclusion: Cervical cancer information quality on short-video platforms is uneven. YouTube offers the highest overall quality, while TikTok achieves the widest reach but lacks content depth. Critical prevention information (e.g. male vaccination) has low coverage. Professional creators provide more reliable content but have limited reach. Platforms should enhance promotion of authoritative content and implement quality review mechanisms.
    Keywords:  Bilibili; Cervical cancer; HPV vaccine; TikTok; YouTube; cross-sectional study; health communication; information quality; screening; short videos
    DOI:  https://doi.org/10.1177/20552076261415927
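     Entry 25 compares platforms with the Kruskal-Wallis H test, the nonparametric analogue of one-way ANOVA for three or more groups. A minimal scipy sketch with invented GQS scores, not the study's data.

        # Kruskal-Wallis H test comparing quality scores across three platforms.
        from scipy.stats import kruskal

        youtube  = [4, 3, 5, 4, 3, 4, 4, 5, 3, 4]
        tiktok   = [3, 3, 4, 2, 3, 3, 4, 3, 3, 2]
        bilibili = [3, 2, 3, 3, 2, 4, 2, 3, 3, 2]

        h, p = kruskal(youtube, tiktok, bilibili)
        print(f"Kruskal-Wallis: H = {h:.2f}, p = {p:.4f}")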
  26. Digit Health. 2026 Jan-Dec;12:20552076261416720
     Objective: Stroke remains a major source of global disease burden due to its high prevalence and mortality. Social media platforms serve as significant channels for disseminating health-related information. However, their role in the spread of stroke-related information has not been well established. The aim of this study was to explore the role of social media platforms in the spread of stroke-related information.
     Methods: To conduct this cross-sectional study, stroke-related videos were collected from YouTube, Bilibili, and TikTok. The quality of included videos was assessed using the Global Quality Scale (GQS), Journal of the American Medical Association (JAMA) benchmark criteria, and Modified DISCERN scoring systems. A guideline-based content analysis was performed to assess content accuracy and comprehensiveness. Potential positive factors were determined with multiple ordered logistic regression. The dose-response relationship between playback time and likes was analyzed using restricted cubic spline analysis.
     Results: A total of 300 stroke-related videos were included for further analysis (YouTube 100; Bilibili 100; TikTok 100). Mean JAMA scores of YouTube, Bilibili, and TikTok videos were 2.51, 2.62, and 2.76, respectively. Mean GQS scores were 3.11, 2.79, and 2.60, respectively, and mean Modified DISCERN scores were 3.00, 2.88, and 2.78, respectively. No significant difference was found in quality scores across the three platforms. Content analysis suggested that all included videos demonstrated good performance in terms of accuracy and evidence support. Personal experience, health professionals, science communicators, general users, news agencies, and nonprofit organizations were identified as potential positive factors for higher viewer enjoyment. Video playback time was negatively correlated with viewer enjoyment.
    Conclusion: Social media platforms facilitate the spread of stroke-relevant information. To enhance viewer engagement, regardless of the platform, video creators should strive to make their videos more concise.
    Keywords:  Stroke; education; online videos; restricted cubic spline; social media
    DOI:  https://doi.org/10.1177/20552076261416720
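     Entry 26 models the playback-time/likes relationship with a restricted cubic spline. One way to sketch this in Python is a natural cubic regression spline basis via patsy's cr() inside a statsmodels formula, shown here on simulated data; the authors' exact knot placement and modelling choices are not specified in the abstract.

        # Natural (restricted) cubic spline for a nonlinear playback-time vs. likes relationship.
        import numpy as np
        import pandas as pd
        import statsmodels.formula.api as smf

        rng = np.random.default_rng(2)
        n = 300
        duration = rng.uniform(10, 600, n)   # playback time in seconds (simulated)
        likes = 500 * np.exp(-((duration - 90) / 120) ** 2) + rng.normal(0, 30, n)
        data = pd.DataFrame({"duration": duration, "likes": likes})

        # cr(duration, df=4) expands playback time into a natural cubic spline basis with 4 df.
        fit = smf.ols("likes ~ cr(duration, df=4)", data=data).fit()
        print(fit.summary().tables[1])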
  27. Front Med (Lausanne). 2025;12:1751545
       Background: To systematically evaluate the information quality, reliability, and content characteristics of short videos related to age-related macular degeneration on the Chinese mainland version of TikTok, filling the research gap in this field and providing references for ophthalmic health education and platform information governance.
    Methods: This cross-sectional study was conducted on October 15, 2025, by searching the keyword "" on TikTok. The top 200 videos under "comprehensive ranking" were screened, and 196 videos meeting the eligibility criteria were ultimately included. Content integrity was evaluated following the American Academy of Ophthalmology guidelines and the Goobie framework. Video quality was assessed using the DISCERN instrument and the PEMAT-A/V tool. Statistical analyses were performed in IBM SPSS Statistics 27.0, with inter-rater reliability measured by the intraclass correlation coefficient. Differences among groups and associations between variables were examined using ANOVA and correlation analysis.
    Results: The overall quality of the included videos was moderate, with a median DISCERN tool score of 48.00 and a median Overall quality score of 3.00. The Understandability score was relatively high (median 84.62%), whereas the Actionability score was lower (median 75.00%). Videos uploaded by the Non-Profit group showed the highest quality (mean DISCERN tool score 59.13 ± 2.50), followed by the Medical group. The Non-Medical and For-Profit groups demonstrated the lowest quality, with statistically significant differences among groups (P < 0.05). Quality metrics were moderately positively correlated with user engagement metrics. The correlation coefficients between reliability and engagement were r = 0.48-0.54 (P < 0.05). Video duration showed a mild positive correlation with both quality and engagement (r = 0.23-0.30, P < 0.05). Inter-rater reliability was good (intraclass correlation coefficient = 0.839-0.947, P < 0.001).
    Conclusion: Age-related macular degeneration videos on TikTok showed moderate overall quality, with content emphasizing clinical concerns but neglecting basic knowledge. Information quality varied by uploader source, with non-profit organizations and medical professionals providing most high-quality content. Higher-quality videos tended to receive greater user engagement, suggesting that platform algorithms may preferentially spread better educational material. These findings provide empirical support for improving science communication on this disease and strengthening information quality management on digital platforms.
    Keywords:  TikTok; age-related macular degeneration; health education; social media; video quality
    DOI:  https://doi.org/10.3389/fmed.2025.1751545
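    The quality-engagement association reported in the entry above (e.g., r = 0.48-0.54 between reliability and engagement) is an ordinary bivariate correlation. As a minimal sketch on hypothetical per-video data, assuming numpy and scipy are available (the study itself used IBM SPSS), the Python snippet below shows how such a correlation can be computed; it is illustrative only and does not reproduce the authors' analysis or data.

        import numpy as np
        from scipy import stats

        # Hypothetical per-video data (illustrative only; not the study's dataset).
        rng = np.random.default_rng(42)
        n_videos = 196
        discern = rng.normal(48, 10, n_videos)      # DISCERN total score per video
        engagement = np.exp(0.04 * discern + rng.normal(0, 0.8, n_videos))  # skewed likes/shares

        # Pearson correlation between quality and log-transformed engagement,
        # analogous to the quality-engagement association summarized above.
        r, p = stats.pearsonr(discern, np.log1p(engagement))
        print(f"Pearson r = {r:.2f}, p = {p:.4f}")

        # Spearman rank correlation is a common robustness check for skewed counts.
        rho, p_s = stats.spearmanr(discern, engagement)
        print(f"Spearman rho = {rho:.2f}, p = {p_s:.4f}")

    Log-transforming the heavily skewed engagement counts before the Pearson test is one common choice; the rank-based Spearman coefficient avoids that assumption altogether.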
  28. J Cutan Med Surg. 2026 Jan 28. 12034754261418260
      
    Keywords:  content analysis; digital literacy; health information quality; hyperhidrosis; social media; tik tok
    DOI:  https://doi.org/10.1177/12034754261418260
  29. Front Psychol. 2025 ;16 1646584
       Purpose: Risk perception significantly impacts how individuals assess risk, make decisions, and behave. While numerous studies have examined risk perception's impact on emergency information seeking behavior, the nature of the association remains unclear.
    Methods: This study established a theoretical framework, and a meta-analysis was conducted to examine risk perception's impact on emergency information seeking behavior. Fifty relevant studies (29,014 participants) covering risk perception and information seeking behavior data in four emergency scenarios were included.
    Results: A significant positive correlation was found between risk perception and emergency information seeking behavior. Further exploratory analysis indicated different impacts of risk perception on information seeking behavior in each type of emergency (natural disasters, public health emergencies, accidents, and social security emergencies). Health and natural disaster emergencies had a significant positive moderating effect, whereas accidents and social security emergencies had a significant negative moderating effect. We found significant differences in the moderating effects of demographics (national development level and proportion of male participants) and methodology (i.e., publication time, sample collection strategy, and measurement method). Furthermore, we assessed publication bias and literature quality to determine the robustness and scalability of the results.
    Conclusion: To the best of our knowledge, we present the first meta-analysis of risk perception and emergency information seeking behavior, summarizing the rich empirical evidence on this relationship. This study followed contemporary meta-analysis guidelines and best practices to generate transparent and replicable scientific findings. Our findings can help improve the effectiveness of information dissemination in emergencies and offer a theoretical foundation for strengthening public emergency response capabilities.
    Keywords:  disaster type; information seeking behavior; meta-analysis; risk perception; theoretical framework
    DOI:  https://doi.org/10.3389/fpsyg.2025.1646584
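    The pooled effect in a meta-analysis of correlations such as the one above is usually obtained by Fisher z-transforming each study's correlation and combining the transformed values under a random-effects model. As a minimal sketch, assuming hypothetical per-study correlations and sample sizes (not the fifty studies analyzed above), a DerSimonian-Laird random-effects pooling in Python with numpy looks like this:

        import numpy as np

        # Hypothetical per-study correlations and sample sizes (illustrative only).
        r = np.array([0.25, 0.40, 0.10, 0.35, 0.30])
        n = np.array([200, 150, 500, 120, 300])

        # Fisher's z transformation; the sampling variance of z is 1 / (n - 3).
        z = np.arctanh(r)
        v = 1.0 / (n - 3)

        # Fixed-effect weights, heterogeneity statistic Q, and DerSimonian-Laird tau^2.
        w = 1.0 / v
        z_fixed = np.sum(w * z) / np.sum(w)
        Q = np.sum(w * (z - z_fixed) ** 2)
        dof = len(r) - 1
        c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
        tau2 = max(0.0, (Q - dof) / c)

        # Random-effects pooled estimate, back-transformed to the correlation scale.
        w_re = 1.0 / (v + tau2)
        z_re = np.sum(w_re * z) / np.sum(w_re)
        se_re = np.sqrt(1.0 / np.sum(w_re))
        lo, hi = np.tanh([z_re - 1.96 * se_re, z_re + 1.96 * se_re])
        print(f"pooled r = {np.tanh(z_re):.3f}, 95% CI [{lo:.3f}, {hi:.3f}], tau^2 = {tau2:.4f}")

    Moderator effects of the kind reported above (emergency type, national development level, male proportion) would then be examined by repeating this pooling within subgroups or via meta-regression on the z values.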
  30. Health Promot J Austr. 2026 Apr;37(2): e70151
       INTRODUCTION: This study aimed to explore caregivers' (i) priorities and challenges in seeking and using relevant information and services to support their children's mental health, and (ii) preferences for digital solutions that could meet their needs.
    METHODS: Semi-structured interviews were conducted with 13 caregivers experiencing adversity. Transcripts were analysed employing framework analysis.
    RESULTS: When seeking information and support services, caregivers prioritised feeling understood and not being judged. Common challenges were too much or too little relevant information; a lack of time and a 'solution burden' in finding support; and uncertainty about how to proceed in seeking appropriate support. Caregivers used a range of online search strategies to seek information and services to support their child's health needs. They expressed a need for a digital solution that was practical and non-judgemental.
    CONCLUSIONS: Caregivers often sought information and services online but faced challenges with navigating the abundance of resources. A practical and simple online solution providing the right information and services at the right time is critically needed.
    SO WHAT: A digital navigation platform providing evidence-based information and services could assist caregivers in navigating the complex process of seeking support for child and family health and wellbeing.
    Keywords:  child mental health; childhood adversity; digital health; health policy; integrated care; integrated health service; scalability
    DOI:  https://doi.org/10.1002/hpja.70151
  31. Int J Environ Res Public Health. 2026 Jan 22. pii: 134. [Epub ahead of print]23(1):
      Endometriosis affects one in seven women in Australia and is a significant public health concern. Access to appropriate health information is essential for informed decision-making and quality of life, especially for culturally and linguistically diverse (CALD) women who may face additional communication and health literacy barriers. This study explored the information-seeking behaviours and experiences of CALD women living with endometriosis using semi-structured interviews. Through convenience and snowball sampling via social media, eleven women were recruited. Data were analysed using thematic analysis. The results showed that although women often did not view their cultural background as influential, taboos and stigma can shape information-seeking behaviours. Women primarily relied on healthcare professionals, online resources, and other women with endometriosis as information resources. Healthcare professionals were appreciated for providing tailored information, but some were perceived to have limited knowledge of endometriosis, reducing their usefulness. Online information was abundant and easily accessible but often overwhelming and difficult to navigate. Information from other women with lived experience provided both practical insights and validation, though participants recognised its limited transferability to their own circumstances. These findings highlight the need for information pathways, including better patient education through healthcare providers, as well as accessible and evidence-based online resources.
    Keywords:  CALD; cultural diversity; endometriosis; health information; patient education; women’s health
    DOI:  https://doi.org/10.3390/ijerph23010134
  32. medRxiv. 2026 Jan 04. pii: 2026.01.01.25342468. [Epub ahead of print]
      General-purpose large language models (LLMs) like ChatGPT are increasingly used for medical advice despite lacking medical training and frequently producing incorrect or unsafe output. Older adults' health information-seeking behaviors using LLMs remain poorly characterized. We conducted a cross-sectional survey of 574 US adults aged 50+ recruited via Prolific, balanced by sex and race. Participants reported health information sources, ChatGPT and PubMed use, demographics, and health literacy. Most participants (92%) searched online for health information. All had heard of ChatGPT, and 63% used it for medical information, compared to 44% who had heard of PubMed and 39% who used it. Those with inadequate health literacy had higher odds of ChatGPT use for medical advice (AOR 2.36, 95% CI 1.30-4.52) versus those with adequate health literacy. In conclusion, with more than half of older adults using LLMs for medical advice, the development of safer, purpose-trained medical LLMs is warranted.
    DOI:  https://doi.org/10.64898/2026.01.01.25342468
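    The adjusted odds ratio reported in the entry above (AOR 2.36, 95% CI 1.30-4.52) is an exponentiated coefficient from a multivariable logistic regression. As a minimal sketch on simulated survey-style data, assuming pandas and statsmodels are available (the variable names and effect sizes below are hypothetical and do not reproduce the study's model), the calculation looks like this in Python:

        import numpy as np
        import pandas as pd
        import statsmodels.api as sm

        # Simulated survey-style data (illustrative only; not the study's data).
        rng = np.random.default_rng(0)
        n = 500
        df = pd.DataFrame({
            "inadequate_literacy": rng.integers(0, 2, n),  # 1 = inadequate health literacy
            "age": rng.integers(50, 85, n),
            "female": rng.integers(0, 2, n),
        })
        # Simulate ChatGPT use with higher odds among the inadequate-literacy group.
        lin = -0.5 + 0.85 * df["inadequate_literacy"] + 0.01 * (df["age"] - 65)
        df["used_chatgpt"] = (rng.random(n) < 1 / (1 + np.exp(-lin))).astype(int)

        # Logistic regression; exponentiated coefficients are adjusted odds ratios (AORs).
        X = sm.add_constant(df[["inadequate_literacy", "age", "female"]])
        model = sm.Logit(df["used_chatgpt"], X).fit(disp=0)
        aor = np.exp(model.params)
        ci = np.exp(model.conf_int())
        print(pd.DataFrame({"AOR": aor, "2.5%": ci[0], "97.5%": ci[1]}))

    Each AOR is interpreted as the multiplicative change in the odds of the outcome for a one-unit change in that predictor, holding the other covariates constant.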
  33. BMC Med Ethics. 2026 Jan 29.
     BACKGROUND: Certain populations face challenges in accessing online health information, leading to the emergence of Proxy Online Health Information Seeking (Proxy OHIS). Because of their close contact with patients and their medical knowledge, nursing interns frequently receive requests from patients to perform Proxy OHIS, which presents various ethical challenges. To date, no research has specifically examined nursing interns in the context of their Proxy OHIS activities.
    AIM: This study aims to explore the ethical decision-making processes of nursing interns when engaging in Proxy OHIS for patients during their clinical practice and to analyze the associated ethical challenges, thereby contributing to improved ethical decision-making among nursing students.
    METHODS: This qualitative study was guided by a constructivist paradigm. Rest's Four-Component Model of ethical decision-making served as the theoretical framework for examining the decision-making process. Participants were recruited via purposive sampling from a tertiary Grade A hospital in Zibo between August and September 2025. Individual semi-structured interviews were conducted with eligible participants until data saturation was achieved. The study involved nursing interns at multiple educational levels (diploma, bachelor's, and postgraduate) from eight different universities. Data were transcribed verbatim and analyzed using Graneheim and Lundman's conventional qualitative content analysis approach with NVivo 14.0 software.
    RESULTS: A total of 18 nursing interns participated. Content analysis identified four primary categories: "Identifying ethical issues and risks", "Formulating appropriate ethical judgments", "Balancing multiple ethical motivations" and "Implementing prudent behavioral strategies".
    CONCLUSION: Guided by Rest's Four-Component Model, this study revealed that nursing interns encounter a range of ethical challenges throughout their decision-making process when performing Proxy OHIS for patients. These challenges span the entire process, from initial issue recognition to the implementation of final actions. To address these challenges effectively, interventions should focus on two key areas: ethics and online health information seeking. In the future, nursing education stakeholders can build a comprehensive ethical support system, including establishing structured supervision protocols for handling Proxy OHIS requests.
    Keywords:  Ethical challenges; Ethical decision-making; Nursing interns; Nursing students; Proxy online health information seeking; Qualitative research
    DOI:  https://doi.org/10.1186/s12910-026-01387-6