bims-librar 2024-06-23 papers

bims-librar

Biomed News

on Biomedical librarianship

Issue of 2024–06–23
twenty papers selected by
Thomas Krichel, Open Library Society

Exploring Freely Available Data Tools to Support Open Data and Open Science.
Removing Persistent Barriers to Systematic Searching.
Features of databases that supported searching for rapid evidence synthesis during COVID-19: implications for future public health emergencies.
Title and abstract screening for literature reviews using large language models: an exploratory study in the biomedical domain.
Efficiency and comparability of using new evidence platforms for updating recommendations: Experience with a type-2 diabetes guideline in Colombia.
PubChem synonym filtering process using crowdsourcing.
Harnessing PubMed User Query Logs for Post Hoc Explanations of Recommended Similar Articles.
Influence of automated indexing in Medical Subject Headings (MeSH) selection for pharmacy practice journals.
Practices and Barriers in Developing and Disseminating Plain-Language Resources Reporting Medical Research Information: A Scoping Review.
Egestabase - An online evidence platform to discover and explore options to recover plant nutrients from human excreta and domestic wastewater for reuse in agriculture.
Can AI Answer My Questions? Utilizing Artificial Intelligence in the Perioperative Assessment for Abdominoplasty Patients.
Does the Information Quality of ChatGPT Meet the Requirements of Orthopedics and Trauma Surgery?
A Cross-Sectional Analysis of the Readability of Online Information Regarding Hip Osteoarthritis.
Readability of Patient Education Materials in Head and Neck Cancer: A Systematic Review.
Dyslexia Articles Unboxed: Analyzing Their Readability Level.
Quality and Readability of Online Health Information on Common Urologic Cancers: Assessing Barriers to Health Literacy in Urologic Oncology.
Quality controlled YouTube content intervention for enhancing health literacy and health behavioural intention: A randomized controlled study.
YouTube as a source of education in perioperative anesthesia for patients and trainees: a systematic review.
Is YouTube a Useful Source of Information on Pressure Injuries? A Content, Reliability, and Quality Analysis.
TikTok Tracheostomy Video Analysis of Quality, Credibility, and Readability.

J Hosp Librariansh. 2024 ;24(2): 104-111

Exploring Freely Available Data Tools to Support Open Data and Open Science.

Christine Hislop, Katie Pierce Farrier, Elizabeth Roth.

  Librarians support researchers by promoting open science and open data practices. This article explores five freely available tools that support and facilitate open science practices. Open Science Framework provides a platform for project management, data sharing, and data storage. OpenRefine cleans and formats data. DMPTool has templates for data management and sharing plans that comply with funder mandates. The NIH Common Data Elements is a repository for standardized data elements, and finally, the NLM Scrubber is a tool for de-identifying clinical text data. Information professionals can add these tools to their repertoire and share them with researchers at their institution.

Keywords:  Open data; data management; open science

DOI:  https://doi.org/10.1080/15323269.2024.2326787
Am J Nurs. 2024 Jul 01. 124(7): 40-50

Removing Persistent Barriers to Systematic Searching.

Ellen Fineout-Overholt, Deana Hays, Susan Farus-Brown, Mary C Zonsius, Kerry A Milner.

This is the third article in a new series designed to provide readers with insight into educating nurses about evidence-based decision-making (EBDM). It builds on AJN's award-winning previous series-Evidence-Based Practice, Step by Step and EBP 2.0: Implementing and Sustaining Change (to access both series, go to http://links.lww.com/AJN/A133). This follow-up series on EBDM will address how to teach and facilitate learning about the evidence-based practice (EBP) and quality improvement (QI) processes and how they impact health care quality. This series is relevant for all nurses interested in EBP and QI, especially DNP faculty and students. The brief case scenario included in each article describes one DNP student's journey. To access previous articles in this EBDM series, go to http://links.lww.com/AJN/A256.

DOI: https://doi.org/10.1097/01.NAJ.0001025648.00669.e7
BMC Med Res Methodol. 2024 Jun 21. 24(1): 135

Features of databases that supported searching for rapid evidence synthesis during COVID-19: implications for future public health emergencies.

Leah Hagerman, Emily C Clark, Sarah E Neil-Sztramko, Taylor Colangeli, Maureen Dobbins.

   BACKGROUND: As evidence related to the COVID-19 pandemic surged, databases, platforms, and repositories evolved with features and functions to assist users in promptly finding the most relevant evidence. In response, research synthesis teams adopted novel searching strategies to sift through the vast amount of evidence to synthesize and disseminate the most up-to-date evidence. This paper explores the key database features that facilitated systematic searching for rapid evidence synthesis during the COVID-19 pandemic to inform knowledge management infrastructure during future global health emergencies.
METHODS: This paper outlines the features and functions of previously existing and newly created evidence sources routinely searched as part of the NCCMT's Rapid Evidence Service methods, including databases, platforms, and repositories. Specific functions of each evidence source were assessed as they pertain to searching in the context of a public health emergency, including the topics of indexed citations, the level of evidence of indexed citations, and specific usability features of each evidence source.
RESULTS: Thirteen evidence sources were assessed, of which four were newly created and nine were either pre-existing or adapted from previously existing resources. Evidence sources varied in topics indexed, level of evidence indexed, and specific searching functions.
CONCLUSION: This paper offers insights into which features enabled systematic searching for the completion of rapid reviews to inform decision makers within 5-10 days. These findings provide guidance for knowledge management strategies and evidence infrastructures during future public health emergencies.

Keywords:  COVID-19; Databases; Evidence- informed decision making; Public health; Rapid reviews; Repositories; Searching; Systematic reviews

DOI:  https://doi.org/10.1186/s12874-024-02246-x
Syst Rev. 2024 Jun 15. 13(1): 158

Title and abstract screening for literature reviews using large language models: an exploratory study in the biomedical domain.

Fabio Dennstädt, Johannes Zink, Paul Martin Putora, Janna Hastings, Nikola Cihoric.

   BACKGROUND: Systematically screening published literature to determine the relevant publications to synthesize in a review is a time-consuming and difficult task. Large language models (LLMs) are an emerging technology with promising capabilities for the automation of language-related tasks that may be useful for such a purpose.
METHODS: LLMs were used as part of an automated system to evaluate the relevance of publications to a certain topic based on defined criteria and based on the title and abstract of each publication. A Python script was created to generate structured prompts consisting of text strings for instruction, title, abstract, and relevant criteria to be provided to an LLM. The relevance of a publication was evaluated by the LLM on a Likert scale (low relevance to high relevance). By specifying a threshold, different classifiers for inclusion/exclusion of publications could then be defined. The approach was used with four different openly available LLMs on ten published data sets of biomedical literature reviews and on a newly human-created data set for a hypothetical new systematic literature review.
RESULTS: The performance of the classifiers varied depending on the LLM being used and on the data set analyzed. Regarding sensitivity/specificity, the classifiers yielded 94.48%/31.78% for the FlanT5 model, 97.58%/19.12% for the OpenHermes-NeuralChat model, 81.93%/75.19% for the Mixtral model and 97.58%/38.34% for the Platypus 2 model on the ten published data sets. The same classifiers yielded 100% sensitivity at a specificity of 12.58%, 4.54%, 62.47%, and 24.74% on the newly created data set. Changing the standard settings of the approach (minor adaption of instruction prompt and/or changing the range of the Likert scale from 1-5 to 1-10) had a considerable impact on the performance.
CONCLUSIONS: LLMs can be used to evaluate the relevance of scientific publications to a certain review topic and classifiers based on such an approach show some promising results. To date, little is known about how well such systems would perform if used prospectively when conducting systematic literature reviews and what further implications this might have. However, it is likely that in the future researchers will increasingly use LLMs for evaluating and classifying scientific publications.

Keywords:  Biomedicine; Large language models; Natural language processing; Systematic literature review; Title and abstract screening

DOI:  https://doi.org/10.1186/s13643-024-02575-4
Medwave. 2024 Jun 17. 24(5): e2781

Efficiency and comparability of using new evidence platforms for updating recommendations: Experience with a type-2 diabetes guideline in Colombia.

Juan Carlos Villar, Luz Angela Torres López, Anamaría Muñoz Flórez, Angela Manuela Balcázar, Laura Parra-Gómez, Edgar Camilo Barrera.

   Introduction: Updating recommendations for guidelines requires a comprehensive and efficient literature search. Although new information platforms are available for developing groups, their relative contributions to this purpose remain uncertain.
Methods: As part of a review/update of eight selected evidence-based recommendationsfor type 2 diabetes, we evaluated the following five literature search approaches (targeting systematic reviews, using predetermined criteria): PubMed for MEDLINE, Epistemonikos database basic search, Epistemonikos database using a structured search strategy, Living overview of evidence (L.OVE) platform, and TRIP database. Three reviewers independently classified the retrieved references as definitely eligible, probably eligible, or not eligible. Those falling in the same "definitely" categories for all reviewers were labelled as "true" positives/negatives. The rest went to re-assessment and if found eligible/not eligible by consensus became "false" negatives/positives, respectively. We described the yield for each approach and computed "diagnostic accuracy" measures and agreement statistics.
Results: Altogether, the five approaches identified 318 to 505 references for the eight recommendations, from which reviewers considered 4.2 to 9.4% eligible after the two rounds. While Pubmed outperformed the other approaches (diagnostic odds ratio 12.5 versus 2.6 to 5.3), no single search approach returned eligible references for all recommendations. Individually, searches found up to 40% of all eligible references (n = 71), and no combination of any three approaches could find over 80% of them. Kappa statistics for retrieval between searches were very poor (9 out of 10 paired comparisons did not surpass the chance-expected agreement).
Conclusion: Among the information platforms assessed, PubMed appeared to be more efficient in updating this set of recommendations. However, the very poor agreement among search approaches in the reference yield demands that developing groups add information from several (probably more than three) sources for this purpose. Further research is needed to replicate our findings and enhance our understanding of how to efficiently update recommendations.

Keywords:  Evidence-based recommendations; clinical practice guidelines; guidelines update; literature search

DOI:  https://doi.org/10.5867/medwave.2024.05.2781
J Cheminform. 2024 Jun 16. 16(1): 69

PubChem synonym filtering process using crowdsourcing.

Sunghwan Kim, Bo Yu, Qingliang Li, Evan E Bolton.

  PubChem ( https://pubchem.ncbi.nlm.nih.gov ) is a public chemical information resource containing more than 100 million unique chemical structures. One of the most requested tasks in PubChem and other chemical databases is to search chemicals by name (also commonly called a "chemical synonym"). PubChem performs this task by looking up chemical synonym-structure associations provided by individual depositors to PubChem. In addition, these synonyms are used for many purposes, including creating links between chemicals and PubMed articles (using Medical Subject Headings (MeSH) terms). However, these depositor-provided name-structure associations are subject to substantial discrepancies within and between depositors, making it difficult to unambiguously map a chemical name to a specific chemical structure. The present paper describes PubChem's crowdsourcing-based synonym filtering strategy, which resolves inter- and intra-depositor discrepancies in synonym-structure associations as well as in the chemical-MeSH associations. The PubChem synonym filtering process was developed based on the analysis of four crowd-voting strategies, which differ in the consistency threshold value employed (60% vs 70%) and how to resolve intra-depositor discrepancies (a single vote vs. multiple votes per depositor) prior to inter-depositor crowd-voting. The agreement of voting was determined at six levels of chemical equivalency, which considers varying isotopic composition, stereochemistry, and connectivity of chemical structures and their primary components. While all four strategies showed comparable results, Strategy I (one vote per depositor with a 60% consistency threshold) resulted in the most synonyms assigned to a single chemical structure as well as the most synonym-structure associations disambiguated at the six chemical equivalency contexts. Based on the results of this study, Strategy I was implemented in PubChem's filtering process that cleans up synonym-structure associations as well as chemical-MeSH associations. This consistency-based filtering process is designed to look for a consensus in name-structure associations but cannot attest to their correctness. As a result, it can fail to recognize correct name-structure associations (or incorrect ones), for example, when a synonym is provided by only one depositor or when many contributors are incorrect. However, this filtering process is an important starting point for quality control in name-structure associations in large chemical databases like PubChem.

Keywords:  Chemical database; Chemical name-structure association; Crowdsourcing; Crowdvoting; Database search; Medical Subject Headings (MeSH); PubChem

DOI:  https://doi.org/10.1186/s13321-024-00868-3
ArXiv. 2024 Feb 05. pii: arXiv:2402.03484v1. [Epub ahead of print]

Harnessing PubMed User Query Logs for Post Hoc Explanations of Recommended Similar Articles.

Ashley Shin, Qiao Jin, James Anibal, Zhiyong Lu.

Searching for a related article based on a reference article is an integral part of scientific research. PubMed, like many academic search engines, has a "similar articles" feature that recommends articles relevant to the current article viewed by a user. Explaining recommended items can be of great utility to users, particularly in the literature search process. With more than a million biomedical papers being published each year, explaining the recommended similar articles would facilitate researchers and clinicians in searching for related articles. Nonetheless, the majority of current literature recommendation systems lack explanations for their suggestions. We employ a post hoc approach to explaining recommendations by identifying relevant tokens in the titles of similar articles. Our major contribution is building PubCLogs by repurposing 5.6 million pairs of coclicked articles from PubMed's user query logs. Using our PubCLogs dataset, we train the Highlight Similar Article Title (HSAT), a transformer-based model designed to select the most relevant parts of the title of a similar article, based on the title and abstract of a seed article. HSAT demonstrates strong performance in our empirical evaluations, achieving an F1 score of 91.72 percent on the PubCLogs test set, considerably outperforming several baselines including BM25 (70.62), MPNet (67.11), MedCPT (62.22), GPT-3.5 (46.00), and GPT-4 (64.89). Additional evaluations on a separate, manually annotated test set further verifies HSAT's performance. Moreover, participants of our user study indicate a preference for HSAT, due to its superior balance between conciseness and comprehensiveness. Our study suggests that repurposing user query logs of academic search engines can be a promising way to train state-of-the-art models for explaining literature recommendation.
Res Social Adm Pharm. 2024 Jun 12. pii: S1551-7411(24)00192-X. [Epub ahead of print]

Influence of automated indexing in Medical Subject Headings (MeSH) selection for pharmacy practice journals.

Fernando Fernandez-Llimos, Luciana G Negrão, Christine Bond, Derek Stewart.

   BACKGROUND: The Medical Subject Headings (MeSH) thesaurus is the controlled vocabulary used to index articles in MEDLINE. MeSH were mainly manually selected until June 2022 when an automated algorithm, the Medical Text Indexer (MTI) automated was fully implemented. A selection of automated indexed articles is then reviewed (curated) by human indexers to ensure the quality of the process.
OBJECTIVE: To describe the association of MEDLINE indexing methods (i.e., manual, automated, and automated + curated) on the MeSH assignment in pharmacy practice journals compared with medical journals.
METHODS: Original research articles published between 2016 and 2023 in two groups of journals (i.e., the Big-five general medicine and three pharmacy practice journals) were selected from PubMed using journal-specific search strategies. Metadata of the articles, including MeSH terms and indexing method, was extracted. A list of pharmacy-specific MeSH terms had been compiled from previously published studies, and their presence in pharmacy practice journal records was investigated. Using bivariate and multivariate analyses, as well as effect size measures, the number of MeSH per article was compared between journal groups, geographic origin of the journal, and indexing method.
RESULTS: A total of 8479 original research articles was retrieved: 6254 from the medical journals and 2225 from pharmacy practice journals. The number of articles indexed by the various methods was disproportionate; 77.8 % of medical and 50.5 % of pharmacy manually indexed. Among those indexed using the automated system, 51.1 % medical and 10.9 % pharmacy practice articles were then curated to ensure the indexing quality. Number of MeSH per article varied among the three indexing methods for medical and pharmacy journals, with 15.5 vs. 13.0 in manually indexed, 9.4 vs. 7.4 in automated indexed, and 12.1 vs. 7.8 in automated and then curated, respectively. Multivariate analysis showed significant effect of indexing method and journal group in the number of MeSH attributed, but not the geographical origin of the journal.
CONCLUSIONS: Articles indexed using automated MTI have less MeSH than manually indexed articles. Articles published in pharmacy practice journals were indexed with fewer number of MeSH compared with general medical journal articles regardless of the indexing method used.

Keywords:  Medical subject headings; Periodicals as topic; Pharmacists; Pharmacy; Publications; Terminology as topic

DOI:  https://doi.org/10.1016/j.sapharm.2024.06.003
Patient. 2024 Jun 15.

Practices and Barriers in Developing and Disseminating Plain-Language Resources Reporting Medical Research Information: A Scoping Review.

Avishek Pal, Isabelle Arnet, Bernice Simone Elger, Tenzin Wangmo.

BACKGROUND: The intent of plain-language resources (PLRs) reporting medical research information is to advance health literacy among the general public and enable them to participate in shared decision-making (SDM). Regulatory mandates coupled with academic and industry initiatives have given rise to an increasing volume of PLRs summarizing medical research information. However, there is significant variability in the quality, format, readability, and dissemination channels for PLRs. In this scoping review, we identify current practices, guidance, and barriers in developing and disseminating PLRs reporting medical research information to the general public including patients and caregivers. We also report on the PLR preferences of these intended audiences.
METHODS: A literature search of three bibliographic databases (PubMed, EMBASE, Web of Science) and three clinical trial registries (NIH, EMA, ISRCTN registry) was performed. Snowball searches within reference lists of primary articles were added. Articles with PLRs or reporting topics related to PLRs use and development available between January 2017 and June 2023 were identified. Evidence mapping and synthesis were used to make qualitative observations. Identified PLRs were quantitatively assessed, including temporal annual trends, availability by field of medicine, language, and publisher types.
RESULTS: A total of 9116 PLRs were identified, 9041 from the databases and 75 from clinical trial registries. The final analysis included 6590 PLRs from databases and 72 from registries. Reported barriers to PLR development included ambiguity in guidance, lack of incentives, and concerns of researchers writing for the general public. Available guidance recommendations called for greater dissemination, increased readability, and varied content formats. Patients preferred visual PLRs formats (e.g., videos, comics), which were easy to access on the internet and used short jargon-free text. In some instances, older audiences and more educated readers preferred text-only PLRs. Preferences among the general public were mostly similar to those of patients. Psychology, followed by oncology, showed the highest number of PLRs, predominantly from academia-sponsored research. Text-only PLRs were most commonly available, while graphical, digital, or online formats were less available. Preferred dissemination channels included paywall-free journal websites, indexing on PubMed, third-party websites, via email to research participants, and social media.
CONCLUSIONS: This scoping review maps current practices, recommendations, and patients' and the general public's preferences for PLR development and dissemination. The results suggest that making PLRs available to a wider audience by improving nomenclature, accessibility, and providing translations may contribute to empowerment and SDM. Minimizing variability among available guidance for PLR development may play an important role in amplifying the value and impact of these resources.

DOI: https://doi.org/10.1007/s40271-024-00700-y
MethodsX. 2024 Jun;12 102774

Egestabase - An online evidence platform to discover and explore options to recover plant nutrients from human excreta and domestic wastewater for reuse in agriculture.

Robin Harder, Geneviève S Metson, Biljana Macura, Solveig Johannesdottir, Rosanne Wielemaker, Dan Seddon, Emma Lundin, Abdulhamid Aliahmad, Erik Kärrman, Jennifer R McConville.

  Restoring nutrient circularity across scales is important for ecosystem integrity as well as nutrient and food security. As such, research and development of technologies to recover plant nutrients from various organic residues has intensified. Yet, this emerging field is diverse and difficult to navigate, especially for newcomers. As an increasing number of actors search for circular solutions to nutrient management, there is a need to simplify access to the latest knowledge. Since the majority of nutrients entering urban areas end up in human excreta, we have chosen to focus on human excreta and domestic wastewater. Through systematic mapping with stakeholder engagement, we compiled and consolidated available evidence from research and practice. In this paper, we present 'Egestabase' - a carefully curated open-access online evidence platform that presents this evidence base in a systematic and accessible manner. We hope that this online evidence platform helps a variety of actors to navigate evidence on circular nutrient solutions for human excreta and domestic wastewater with ease and keep track of new findings.

Keywords:  Ecotechnologies; Egestabase; Evidence synthesis; Nutrient circularity; Resource-oriented sanitation; Systematic map

DOI:  https://doi.org/10.1016/j.mex.2024.102774
Aesthetic Plast Surg. 2024 Jun 19.

Can AI Answer My Questions? Utilizing Artificial Intelligence in the Perioperative Assessment for Abdominoplasty Patients.

Bryan Lim, Ishith Seth, Roberto Cuomo, Peter Sinkjær Kenney, Richard J Ross, Foti Sofiadellis, Paola Pentangelo, Alessandra Ceccaroni, Carmine Alfano, Warren Matthew Rozen.

   BACKGROUND: Abdominoplasty is a common operation, used for a range of cosmetic and functional issues, often in the context of divarication of recti, significant weight loss, and after pregnancy. Despite this, patient-surgeon communication gaps can hinder informed decision-making. The integration of large language models (LLMs) in healthcare offers potential for enhancing patient information. This study evaluated the feasibility of using LLMs for answering perioperative queries.
METHODS: This study assessed the efficacy of four leading LLMs-OpenAI's ChatGPT-3.5, Anthropic's Claude, Google's Gemini, and Bing's CoPilot-using fifteen unique prompts. All outputs were evaluated using the Flesch-Kincaid, Flesch Reading Ease score, and Coleman-Liau index for readability assessment. The DISCERN score and a Likert scale were utilized to evaluate quality. Scores were assigned by two plastic surgical residents and then reviewed and discussed until a consensus was reached by five plastic surgeon specialists.
RESULTS: ChatGPT-3.5 required the highest level for comprehension, followed by Gemini, Claude, then CoPilot. Claude provided the most appropriate and actionable advice. In terms of patient-friendliness, CoPilot outperformed the rest, enhancing engagement and information comprehensiveness. ChatGPT-3.5 and Gemini offered adequate, though unremarkable, advice, employing more professional language. CoPilot uniquely included visual aids and was the only model to use hyperlinks, although they were not very helpful and acceptable, and it faced limitations in responding to certain queries.
CONCLUSION: ChatGPT-3.5, Gemini, Claude, and Bing's CoPilot showcased differences in readability and reliability. LLMs offer unique advantages for patient care but require careful selection. Future research should integrate LLM strengths and address weaknesses for optimal patient education.
LEVEL OF EVIDENCE V: This journal requires that authors assign a level of evidence to each article. For a full description of these Evidence-Based Medicine ratings, please refer to the Table of Contents or the online Instructions to Authors www.springer.com/00266 .

Keywords:  AI; Abdominoplasty; ChatGPT; LLM; Perioperative

DOI:  https://doi.org/10.1007/s00266-024-04157-0
Cureus. 2024 May;16(5): e60318

Does the Information Quality of ChatGPT Meet the Requirements of Orthopedics and Trauma Surgery?

Adnan Kasapovic, Thaer Ali, Mari Babasiz, Jessica Bojko, Martin Gathen, Robert Kaczmarczyk, Jonas Roos.

   BACKGROUND: The integration of artificial intelligence (AI) in medicine, particularly through AI-based language models like ChatGPT, offers a promising avenue for enhancing patient education and healthcare delivery. This study aims to evaluate the quality of medical information provided by Chat Generative Pre-trained Transformer (ChatGPT) regarding common orthopedic and trauma surgical procedures, assess its limitations, and explore its potential as a supplementary source for patient education.
METHODS: Using the GPT-3.5-Turbo version of ChatGPT, simulated patient information was generated for 20 orthopedic and trauma surgical procedures. The study utilized standardized information forms as a reference for evaluating ChatGPT's responses. The accuracy and quality of the provided information were assessed using a modified DISCERN instrument, and a global medical assessment was conducted to categorize the information's usefulness and reliability.
RESULTS: ChatGPT mentioned an average of 47% of relevant keywords across procedures, with a variance in the mention rate between 30.5% and 68.6%. The average modified DISCERN (mDISCERN) score was 2.4 out of 5, indicating a moderate to low quality of information. None of the ChatGPT-generated fact sheets were rated as "very useful," with 45% deemed "somewhat useful," 35% "not useful," and 20% classified as "dangerous." A positive correlation was found between higher mDISCERN scores and better physician ratings, suggesting that information quality directly impacts perceived utility.
CONCLUSION: While AI-based language models like ChatGPT hold significant promise for medical education and patient care, the current quality of information provided in the field of orthopedics and trauma surgery is suboptimal. Further development and refinement of AI sources and algorithms are necessary to improve the accuracy and reliability of medical information. This study underscores the need for ongoing research and development in AI applications in healthcare, emphasizing the critical role of accurate, high-quality information in patient education and informed consent processes.

Keywords:  artificial intelligence in medicine; chatgpt; large language model; medical information quality; patient education

DOI:  https://doi.org/10.7759/cureus.60318
Cureus. 2024 May;16(5): e60536

A Cross-Sectional Analysis of the Readability of Online Information Regarding Hip Osteoarthritis.

Brandon Lim, Ariel Chai, Mohamed Shaalan.

  Introduction Osteoarthritis (OA) is an age-related degenerative joint disease. There is a 25% risk of symptomatic hip OA in patients who live up to 85 years of age. It can impair a person's daily activities and increase their reliance on healthcare services. It is primarily managed with education, weight loss and exercise, supplemented with pharmacological interventions. Poor health literacy is associated with negative treatment outcomes and patient dissatisfaction. A literature search found there are no previously published studies examining the readability of online information about hip OA. Objectives To assess the readability of healthcare websites regarding hip OA. Methods The terms "hip pain", "hip osteoarthritis", "hip arthritis", and "hip OA" were searched on Google and Bing. Of 240 websites initially considered, 74 unique websites underwent evaluation using the WebFX online readability software (WebFX®, Harrisburg, USA). Readability was determined using the Flesch Reading Ease Score (FRES), Flesch-Kincaid Reading Grade Level (FKGL), Gunning Fog Index (GFI), Simple Measure of Gobbledygook (SMOG), Coleman-Liau Index (CLI), and Automated Readability Index (ARI). In line with recommended guidelines and previous studies, FRES >65 or a grade level score of sixth grade and under was considered acceptable. Results The average FRES was 56.74±8.18 (range 29.5-79.4). Only nine (12.16%) websites had a FRES score >65. The average FKGL score was 7.62±1.69 (range 4.2-12.9). Only seven (9.46%) websites were written at or below a sixth-grade level according to the FKGL score. The average GFI score was 9.20±2.09 (range 5.6-16.5). Only one (1.35%) website was written at or below a sixth-grade level according to the GFI score. The average SMOG score was 7.29±1.41 (range 5.4-12.0). Only eight (10.81%) websites were written at or below a sixth-grade level according to the SMOG score. The average CLI score was 13.86±1.75 (range 9.6-19.7). All 36 websites were written above a sixth-grade level according to the CLI score. The average ARI score was 6.91±2.06 (range 3.1-14.0). Twenty-eight (37.84%) websites were written at or below a sixth-grade level according to the ARI score. One-sample t-tests showed that FRES (p<0.001, CI -10.2 to -6.37), FKGL (p<0.001, CI 1.23 to 2.01), GFI (p<0.001, CI 2.72 to 3.69), SMOG (p<0.001, CI 0.97 to 1.62), CLI (p<0.001, CI 7.46 to 8.27), and ARI (p<0.001, CI 0.43 to 1.39) scores were significantly different from the accepted standard. One-way analysis of variance (ANOVA) testing of FRES scores (p=0.009) and CLI scores (p=0.009) showed a significant difference between categories. Post hoc testing showed a significant difference between academic and non-profit categories for FRES scores (p=0.010, CI -15.17 to -1.47) and CLI scores (p=0.008, CI 0.35 to 3.29). Conclusions Most websites regarding hip OA are written above recommended reading levels, hence exceeding the comprehension levels of the average patient. Readability of these resources must be improved to improve patient access to online healthcare information which can lead to improved patient understanding of their own condition and treatment outcomes.

Keywords:  healthcare education; hip osteoarthritis; internet; online medical education; orthopaedic surgery

DOI:  https://doi.org/10.7759/cureus.60536
JAMA Otolaryngol Head Neck Surg. 2024 Jun 20.

Readability of Patient Education Materials in Head and Neck Cancer: A Systematic Review.

Maria Armache, Sahar Assi, Richard Wu, Amiti Jain, Joseph Lu, Larissa Gordon, Lisa M Jacobs, Christopher E Fundakowski, Kristin L Rising, Amy E Leader, Carole Fakhry, Leila J Mady.

Importance: Patient education materials (PEMs) can promote patient engagement, satisfaction, and treatment adherence. The American Medical Association recommends that PEMs be developed for a sixth-grade or lower reading level. Health literacy (HL) refers to an individual's ability to seek, understand, and use health information to make appropriate decisions regarding their health. Patients with suboptimal HL may not be able to understand or act on health information and are at risk for adverse health outcomes.
Objective: To assess the readability of PEMs on head and neck cancer (HNC) and to evaluate HL among patients with HNC.
Evidence Review: A systematic review of the literature was performed by searching Cochrane, PubMed, and Scopus for peer-reviewed studies published from 1995 to 2024 using the keywords head and neck cancer, readability, health literacy, and related synonyms. Full-text studies in English that evaluated readability and/or HL measures were included. Readability assessments included the Flesch-Kincaid Grade Level (FKGL grade, 0-20, with higher grades indicating greater reading difficulty) and Flesch Reading Ease (FRE score, 1-100, with higher scores indicating easier readability), among others. Reviews, conference materials, opinion letters, and guidelines were excluded. Study quality was assessed using the Appraisal Tool for Cross-Sectional Studies.
Findings: Of the 3235 studies identified, 17 studies assessing the readability of 1124 HNC PEMs produced by professional societies, hospitals, and others were included. The mean FKGL grade ranged from 8.8 to 14.8; none of the studies reported a mean FKGL of grade 6 or lower. Eight studies assessed HL and found inadequate HL prevalence ranging from 11.9% to 47.0%.
Conclusions and Relevance: These findings indicate that more than one-third of patients with HNC demonstrate inadequate HL, yet none of the PEMs assessed were developed for a sixth grade or lower reading level, as recommended by the American Medical Association. This incongruence highlights the need to address the readability of HNC PEMs to improve patient understanding of the disease and to mitigate potential barriers to shared decision-making for patients with HNC. It is crucial to acknowledge the responsibility of health care professionals to produce and promote more effective PEMs to dismantle the potentially preventable literacy barriers.

DOI: https://doi.org/10.1001/jamaoto.2024.1569
J Dev Behav Pediatr. 2024 May-Jun 01;45(3):45(3): e211-e216

Dyslexia Articles Unboxed: Analyzing Their Readability Level.

Yusuke Matsuura, Chung Jaeah.

OBJECTIVE: Dyslexia is characterized by difficulties with fluent word recognition, decoding, or spelling, and it has been linked to family history. Given the impact of dyslexia on broad academic activities and well-being, ensuring that information about dyslexia is accessible to affected children and their families is vital. This study aims to assess the readability levels of dyslexia-related websites, with the hypothesis that such websites should be written at an appropriate readability level to accommodate those who may also have reading challenges.
METHODS: This study analyzes the readability of 50 web articles on dyslexia using 6 readability formulas: Flesch-Kincaid Reading Ease, Flesch-Kincaid Grade Level, Gunning Fog Score, Simple Measure of Gobbledygook (SMOG) Index, Coleman Liau Index, and Automated Readability Index. The search term "What is dyslexia" was used on Google. Each article was analyzed using the online calculation website WebFX. The readability goal for these websites was set at fifth grade, a level recommended for patients with reading challenges.
RESULTS: The study found that among the 50 websites, the lowest median readability score was 11.8 (corresponding to a 12th-grade level) on the SMOG Index, while the highest scores were 15.5 on both the Gunning Fog Score and the Coleman Liau Index (indicative of college-level readability). Almost none of the websites had scores below a fifth-grade level.
CONCLUSION: Most websites related to dyslexia are too complex. Tools such as readability metrics and sentence restructuring by AI can help make the information more accessible and understandable to the stakeholders.

DOI: https://doi.org/10.1097/DBP.0000000000001274
Urol Pract. 2024 Jul;11(4): 670-676

Quality and Readability of Online Health Information on Common Urologic Cancers: Assessing Barriers to Health Literacy in Urologic Oncology.

Julian A Giakas, Michael Zaliznyak, Abigail Kohut-Jackson, Mohammad Mahmoud, Lindsay Lombardo, Eric Ballon-Landa, Zachary Hamilton.

   INTRODUCTION: A growing number of Americans search online for health information related to urologic oncologic care each year. The American Medical Association recommends that medical information be written at a maximum sixth-grade level in order to be comprehensible by the majority of patients. As such, it is important to assess the quality and readability of online patient education material that patients are being exposed to.
METHODS: A Google search was performed using the terms "testicular cancer," "prostate cancer," "kidney cancer," and "bladder cancer," and the top 30 results for each were reviewed. Websites were categorized based on their source. Readability was assessed using the Flesch-Kincaid Grade Level, the Gunning Frequency of Gobbledygook, and the Simple Measure of Gobbledygook indices. Quality was assessed using the DISCERN Quality Index (1-5 scale).
RESULTS: A total of 91 websites were included in our analysis. On average, online health information pertaining to urologic cancers is written at a 10th- to 11th-grade reading level, which is significantly higher than that of an average American adult and that recommended by the American Medical Association (P < .01). The overall quality of websites was 3.4 ± 0.7, representing moderate to high quality. There was no significant difference in readability based on cancer type or information source.
CONCLUSIONS: Despite being of moderate to high quality, online patient education materials related to common urologic cancers are often written at a grade level that exceeds the reading level of an average American adult. This presents as a barrier to online health literacy and calls into question the utility of these resources.

Keywords:  genitourinary cancer; health literacy; online health; reading comprehension; urologic oncology

DOI:  https://doi.org/10.1097/UPJ.0000000000000574
Digit Health. 2024 Jan-Dec;10:10 20552076241263691

Quality controlled YouTube content intervention for enhancing health literacy and health behavioural intention: A randomized controlled study.

Yujin Park, Su Hwan Kim, Hyung-Jin Yoon.

   Background: Individuals increasingly turn to the Internet for health information, with YouTube being a prominent source. However, the quality and reliability of the health information vary widely, potentially affecting health literacy and behavioural intentions.
Methods: To analyse the impact of health information quality on health literacy and behavioural intention, we conducted a randomized controlled trial using a quality-controlled YouTube intervention. Health information quality on YouTube was evaluated using the Global Quality Score and DISCERN. We randomly allocated (1 : 1) to the intervention group to watch the highest quality-evaluated content and to the control group to watch the lowest quality-evaluated content. Health literacy and health behavioural intention were assessed before and after watching YouTube. The trial was set for two different topics: interpreting laboratory test results from health check-up and information about inflammatory bowel disease (IBD).
Results: From 8 April 2022 to 15 April 2022, 505 participants were randomly assigned to watch either high-quality content (intervention group, n = 255) or low-quality content (control group, n = 250). Health literacy significantly improved in the intervention group (28.1 before and 31.8 after; p < 0.01 for health check-up; 28.3 before and 31.3 after; p < 0.01 for IBD). Health behavioural intention significantly improved in the intervention group (3.5 before and 4.1 after; p < 0.01 for health check-up; 3.6 before and 4.0 after; p < 0.01 for IBD). Control groups had no such effect.
Conclusion: High-quality health information can enhance health literacy and behavioural intention in both healthy individuals and those with specific conditions like IBD. It stresses the significance of ensuring reliable health information online and calls for future efforts to curate and provide access to high-quality health content.

Keywords:  Health literacy; YouTube content intervention; health behaviour; health information quality; randomized controlled study

DOI:  https://doi.org/10.1177/20552076241263691
Can J Anaesth. 2024 Jun 20.

YouTube as a source of education in perioperative anesthesia for patients and trainees: a systematic review.

Matthew W Nelms, Arshia Javidan, Ki Jinn Chin, Muralie Vignarajah, Fangwen Zhou, Chenchen Tian, Yung Lee, Ahmed Kayssi, Faysal Naji, Mandeep Singh.

   BACKGROUND: Online video sharing platforms like YouTube (Google LLC, San Bruno, CA, USA) have become a substantial source of health information. We sought to conduct a systematic review of studies assessing the overall quality of perioperative anesthesia videos on YouTube.
METHODS: We searched Embase, MEDLINE, and Ovid for articles published from database inception to 1 May 2023. We included primary studies evaluating YouTube videos as a source of information regarding perioperative anesthesia. We excluded studies not published in English and studies assessing acute or chronic pain. Studies were screened and data were extracted in duplicate by two reviewers. We appraised the quality of studies according to the social media framework published in the literature. We used descriptive statistics to report the results using mean, standard deviation, range, and n/total N (%).
RESULTS: Among 8,908 citations, we identified 14 studies that examined 796 videos with 59.7 hr of content and 47.5 million views. Among the 14 studies that evaluated the video content quality, 17 different quality assessment tools were used, only three of which were externally validated (Global Quality Score, modified DISCERN score, and JAMA score). Per global assessment rating of video quality, 11/13 (85%) studies concluded the overall video quality as poor.
CONCLUSIONS: Overall, the educational content quality of YouTube videos evaluated in the literature accessible as an educational resource regarding perioperative anesthesia was poor. While these videos are in demand, their impact on patient and trainee education remains unclear. A standardized methodology for evaluating online videos is merited to improve future reporting. A peer-reviewed approach to online open-access videos is needed to support patient and trainee education in anesthesia.
STUDY REGISTRATION: Open Science Framework ( https://osf.io/ajse9 ); first posted, 1 May 2023.

Keywords:  YouTube; anesthesia; education; systematic review; videos

DOI:  https://doi.org/10.1007/s12630-024-02791-5
Adv Skin Wound Care. 2024 Jul 01. 37(7): 1-6

Is YouTube a Useful Source of Information on Pressure Injuries? A Content, Reliability, and Quality Analysis.

Cansu Polat Dünya, Öykü Kara, Eylem Toğluk Yiğitoğlu.

OBJECTIVE: To evaluate the comprehensiveness, reliability, and quality of YouTube videos related to pressure injuries.
METHODS: The authors searched YouTube for relevant videos using the keywords "pressure injury", "pressure ulcer", "bedsore", "pressure injury etiology", "pressure injury classification", "pressure injury prevention", "pressure injury risk assessment", and "pressure injury management". Of the 1,023 videos screened, 269 met the inclusion criteria and were included in the study. For each video, the authors recorded the number of views, likes, and comments; the length; and the video upload source. The Comprehensiveness Assessment Tool for Pressure Injuries, the Quality Criteria for Consumer Health Information score, and the Global Quality Score were used to evaluate the comprehensiveness, reliability, and quality of the videos.
RESULTS: The mean length of the 269 videos was 6.22 ± 4.62 minutes (range, 0.18-19.47 minutes). Only 14.5% of the videos (n = 39) were uploaded by universities or professional organizations. Most videos included information about PI prevention (69.5%), followed by PI management (27.9%). The mean comprehensiveness score was 2.33 ± 1.32 (range, 1-5). Nearly half of the videos (49.1%) were not reliable. However, the quality of 43.9% of the videos was somewhat useful. The Quality Criteria for Consumer Health Information mean scores of universities/professional organizations (P < .001), nonprofit healthcare professionals (P = .015), and independent health information channel videos (P = .026) were higher than the mean score of medical advertising/profit companies channel videos.
CONCLUSIONS: This study draws attention to the need for more comprehensive, high-quality, and reliable videos about PIs. It is important that videos on YouTube provide comprehensive and reliable information for patients, caregivers, students, or providers seeking information on PI prevention, assessment, and management.

DOI: https://doi.org/10.1097/ASW.0000000000000172
Cureus. 2024 May;16(5): e60548

TikTok Tracheostomy Video Analysis of Quality, Credibility, and Readability.

Gianfranco Galantini, Ruwaa Samarrai, Amy Hughes, Katherine Kavanagh.

  Objective The goal of this study is to analyze the quality, credibility, and readability of videos on TikTok related to tracheostomy in order to assess the adequacy of the information for patient and parental education purposes. Study design This was a cross-sectional analysis of online content. Methods The social media platform TikTok was explored for videos related to tracheostomy. The search function was utilized with multiple hashtags related to tracheostomy and videos were reviewed and scored for quality, credibility, and readability. Each of the videos was assessed using the DISCERN criteria, JAMA benchmark, and readability score based on text either presented in the video or written in the caption. Pearson's correlation coefficient was calculated for each of the studied parameters. Results The TikTok search bar was queried using multiple hashtags, including "#trach," "#tracheostomy," "#trachea," and "#tracheotomy" for relevant videos from October 14 to October 15, 2021. Overall, 60 videos were selected for complete review and analysis. The total views for all related videos analyzed was 17,712,281. The total likes were 693,812. The videos were primarily posted by non-healthcare professionals making up approximately 72% of all videos. Videos created by physicians generated 63% of all views. The average DISCERN score for each video was 24.83 out of 75. The average Flesch Reading Ease score was 70.59 and the average Flesch-Kincaid Grade level was 5.5. There was a positive DISCERN score and views with R = 0.255 (p = 0.049), positive correlation between DISCERN and likes R = 0.334 (p = 0.009), positive correlation between DISCERN and JAMA R = 0.56 (p=<0.0001), positive correlation between DISCERN and Flesch-Kincaid Grade Level R = 0.330 (p=0.010) and a negative correlation between DISCERN and Flesch Reading Ease Score R = -0.337 (p=0.009). There was also a statistically significant positive correlation between JAMA and Flesch-Kincaid Grade Level R = 0.260 (p=0.045). Conclusion Overall, the quality of the videos on TikTok regarding tracheostomy rated poorly on the DISCERN quality index but included text that was fairly easy to read. Currently, medical videos on TikTok do not meet the quality metrics needed to properly educate the public and should not be used as a primary resource.

Keywords:  credibility; quality; social media; tiktok; tracheostomy

DOI:  https://doi.org/10.7759/cureus.60548