bims-librar Biomed News
on Biomedical librarianship
Issue of 2025-07-20
seventeen papers selected by
Thomas Krichel, Open Library Society



  1. Med Ref Serv Q. 2025 Jul 15. 1-6
      MedPix is a free, open-access, online database developed by the National Library of Medicine that houses a vast collection of over 59,000 medical images, 12,000 patient cases, and 9,000 clinical topics. Designed to support both teaching and self-directed learning, the platform serves a diverse audience including physicians, medical students, educators, and researchers. MedPix offers robust search capabilities as well as case-based learning tools that support Continuing Medical Education.
    Keywords:  Continuing medical education credits (CME); MedPix®; medical imaging; online database; review
    DOI:  https://doi.org/10.1080/02763869.2025.2533768
  2. Data Brief. 2025 Aug;61: 111796
      Information retrieval is crucial in many areas, including search engines, information systems, and databases. As an indigenous language of West Java, Indonesia, Sundanese suffers from limited data availability, especially for information retrieval tasks. Previous efforts to build Sundanese datasets focused mainly on text classification and generation, leaving information retrieval underexplored. To address this gap, we introduce the AMSunda dataset, the first resource designed explicitly for fine-tuning and evaluating embedding models in the Sundanese language. AMSunda consists of two dataset types: (1) triplet data containing a query, a positive response, and a negative response, intended for fine-tuning embedding models, and (2) BEIR-compatible data structured for evaluating embedding models on retrieval tasks (both record formats are sketched after this entry). The dataset consists of 1499 documents generated using the GPT-4o-mini LLM, resulting in 7492 triplet passages and 7491 BEIR-format queries. This dataset enables further development of Sundanese-focused models in information retrieval.
    Keywords:  Information retrieval; Natural language processing; Sundanese dataset; Sundanese language; Text embedding
    DOI:  https://doi.org/10.1016/j.dib.2025.111796
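      The two record shapes described in this entry follow common conventions for embedding-model fine-tuning and BEIR-style retrieval evaluation. A minimal sketch of what such records typically look like; the field names are illustrative assumptions, not the actual AMSunda schema, although the corpus/queries/qrels layout shown is the standard BEIR convention:

        # Illustrative record shapes in Python; field names are assumptions.

        # (1) Triplet record for fine-tuning an embedding model:
        triplet = {
            "query":    "<Sundanese query>",
            "positive": "<passage that answers the query>",
            "negative": "<passage that does not answer the query>",
        }

        # (2) BEIR-compatible layout for retrieval evaluation:
        corpus_entry = {"_id": "doc1", "title": "", "text": "<document text>"}  # corpus.jsonl
        query_entry  = {"_id": "q1", "text": "<query text>"}                    # queries.jsonl
        qrel         = ("q1", "doc1", 1)  # qrels row: query-id, corpus-id, relevance

      Triplets in this shape feed directly into standard contrastive objectives (e.g., a triplet or multiple-negatives ranking loss), while the BEIR files plug into the usual BEIR evaluation harness.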
  3. J Eval Clin Pract. 2025 Aug;31(5): e70206
       BACKGROUND: Authors of systematic reviews must select, among several options, the databases for searching articles for inclusion in their analyses. Google Scholar is readily available, easy to use, and widely accepted for everyday information searches, including scientific research. However, there is no consensus for its use as a resource in systematic reviews.
    METHODS: This study assessed the proportion of systematic reviews that used Google Scholar as a resource, the search strategies used, and the number of articles that would potentially be missed if Google Scholar were omitted and the search relied on PubMed, Cochrane Library, and Scopus alone. We also analyzed data from the most recent systematic reviews in clinical medicine indexed in PubMed that listed Google Scholar as one of the resources used for literature searches.
    RESULTS: The term 'Google Scholar' was included in the title and/or abstract of 6.1% of systematic reviews archived by PubMed, compared to 37.5%, 36.1%, 31.8%, 18.5%, and 14.1% for the terms 'PubMed', 'Embase', 'Cochrane', 'Web of Science', and 'Scopus', respectively. Almost all (1029/1030) articles in the results sections of the evaluated systematic reviews could be found in Google Scholar searches. Had Google Scholar been omitted as a resource, 5% of the articles (53/1029) would have been missed. Twenty-one of the 50 (42%) evaluated systematic reviews did not mention the number of articles identified from Google Scholar searches.
    CONCLUSION: Google Scholar, as the most inclusive resource, should be used along with other established resources for systematic reviews. Advances in artificial intelligence may facilitate its use for this scientific purpose.
    Keywords:  Google Scholar; PubMed; database; literature search; search engines; systematic review
    DOI:  https://doi.org/10.1111/jep.70206
  4. Front Digit Health. 2025;7: 1614344
       Objectives: Artificial intelligence (AI) chatbots have gained popularity as a source of information that is easily accessed by patients. The best treatment of acute Achilles tendon ruptures (AATR) remains controversial due to varying surgical repair techniques, postoperative protocols, nonoperative treatment options, and surgeon and patient factors. Given that patients will continue to turn towards AI for answers to medical questions, the purpose of this study is to evaluate whether popular AI engines can provide adequate responses to frequently asked questions regarding AATR.
    Methods: Three AI engines (ChatGPT, Google Gemini, and Microsoft Copilot) were prompted for a concise response to ten common questions regarding AATR management. Four board-certified orthopaedic surgeons were asked to assess the responses using a four-point scale. A Kruskal-Wallis test was used to compare the responses between the three AI systems using the scores assigned by the surgeons (a sketch of this comparison follows the entry).
    Results: All three engines provided comparable answers to 7 of 10 questions (70%). Significant differences were noted between the AI systems for three of the ten questions (Question 4, overall p = .027; Question 7, overall p = .043; and Question 10, overall p = .033). Post hoc analyses revealed that Copilot received significantly poorer scores (higher mean ratings) compared with Gemini for Question 4 (adjusted p = .028) and Question 7 (adjusted p = .036), and a poorer score compared with ChatGPT for Question 10 (adjusted p = .033).
    Conclusions: AI chatbots can appropriately answer concise prompts about diagnosis and management of AATR. The responses provided by the three AI chatbots analyzed in our study were largely uniform and satisfactory, with only one of the engines scoring lower on three of the ten questions. As AI engines advance, they will become an important tool for patient education in orthopaedics.
    Keywords:  Achilles tendon rupture; ChatGPT; Copilot; Gemini; artificial intelligence; chatbot; patient education
    DOI:  https://doi.org/10.3389/fdgth.2025.1614344
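      The rating comparison described in this entry can be reproduced in outline as follows. The scores below are invented placeholders, and Dunn's test (from the scikit-posthocs package) is an assumed choice for the pairwise follow-up, since the abstract does not name its post hoc method:

        # Kruskal-Wallis across three engines, with an assumed Dunn post hoc step.
        from scipy.stats import kruskal
        import scikit_posthocs as sp
        import pandas as pd

        ratings = pd.DataFrame({
            "engine": ["ChatGPT"] * 4 + ["Gemini"] * 4 + ["Copilot"] * 4,
            "score":  [1, 2, 1, 2, 1, 1, 2, 1, 3, 2, 3, 3],  # 4-point scale, lower = better
        })

        groups = [g["score"].values for _, g in ratings.groupby("engine")]
        h, p = kruskal(*groups)
        print(f"Kruskal-Wallis H={h:.2f}, p={p:.3f}")

        if p < 0.05:
            # Pairwise Dunn test with Holm adjustment (an assumption).
            print(sp.posthoc_dunn(ratings, val_col="score", group_col="engine", p_adjust="holm"))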
  5. J Health Commun. 2025 Jul 14. 1-6
      This paper explores the impact of artificial intelligence (AI) on information seeking behavior research and practice, including the need to scrutinize existing information seeking theory, challenge the understood behavioral norms, and consider redefining information literacy and information retrieval education. The historical examination spans from the 1950s to the present, with a specific focus on recent developments in health information seeking and the evaluation of medical information sources. Key to this exploration are ongoing debates in healthcare, ethics, and AI and information literacy education, which represent important dimensions of the impact of emerging technology on information-seeking behavior. The insights provided by this research can be useful for both researchers and practitioners, aiding them in navigating the evolving landscape shaped by AI technology.
    Keywords:  AI research; Artificial Intelligence; health behavior; information seeking; information-seeking behavior
    DOI:  https://doi.org/10.1080/10810730.2025.2533820
  6. J Prosthet Dent. 2025 Jul 16. pii: S0022-3913(25)00556-6. [Epub ahead of print]
       STATEMENT OF PROBLEM: Although artificial intelligence (AI) chatbots have been increasingly used to obtain information about smile design, the accuracy, reliability, and readability of such information for laypersons remain unclear.
    PURPOSE: The purpose of this study was to assess the accuracy, reliability, quality, and readability of responses about digital smile design provided by 4 artificial intelligence models: ChatGPT-3.5, ChatGPT-4, Gemini, and Copilot.
    MATERIAL AND METHODS: The most frequently searched questions regarding smile design were identified via Google search and presented to each AI model. Responses were independently evaluated using a 5-point Likert scale for accuracy, the modified DISCERN scale for reliability, the General Quality Scale (GQS) for quality, and the Flesch Reading Ease Score (FRES) for readability (the standard FRES formula is given after this entry). Normality was assessed by the Kolmogorov-Smirnov test, and group differences by the Kruskal-Wallis test with the Dunn post hoc analysis; statistical significance was set at α=.05.
    RESULTS: ChatGPT-4 achieved the highest median accuracy score of 5 (4-5), with significant differences among models (P<.05). Copilot demonstrated the highest reliability and quality scores (P<.05), while ChatGPT-3.5 responses were the most readable (P<.05); however, all models produced output classified as difficult to read. Only Copilot and Gemini included source citations in their responses.
    CONCLUSIONS: AI chatbots generally provided accurate and moderately reliable information about smile design, but limited readability and insufficient referencing restrict their value as patient education tools. Enhancements in transparency, scientific clarity, and source citation are needed to improve the clinical utility of chatbot systems. These findings are limited to the evaluated models and topic area, and further research is warranted for broader validation.
    DOI:  https://doi.org/10.1016/j.prosdent.2025.06.030
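      The Flesch Reading Ease Score used in this entry is a standard readability measure; as general background (not specific to this study), the usual formula is

        FRES = 206.835 - 1.015 * (total words / total sentences) - 84.6 * (total syllables / total words)

      Scores range from roughly 0 to 100, with higher values easier to read; scores below about 60 are generally considered difficult for lay readers, consistent with the 'difficult to read' classification reported here.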
  7. JMIR Dermatol. 2025 Jul 16. 8: e59054
       Unlabelled: In our study, we developed a GPT assistant with a custom knowledge base for neurocutaneous diseases, tested its ability to answer common patient questions, and showed that a GPT assistant using retrieval-augmented generation can improve the readability of patient educational material without being prompted for a specific reading level (a generic sketch of the retrieval-augmented pattern follows this entry).
    Keywords:  API; ChatGPT; GPT assistant; LLMs; NLP; OpenAI; answer; application programming interface; artificial intelligence; custom GPT; cutaneous; dermatology; educational; generative AI; health education; large language model; machine learning; natural language processing; neurocutaneous syndromes; patient education; readability; response; skin
    DOI:  https://doi.org/10.2196/59054
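      The retrieval-augmented pattern this entry relies on can be illustrated generically. The sketch below uses TF-IDF retrieval over a two-document toy knowledge base, with a placeholder comment where the language-model call would go; the study itself used OpenAI's GPT assistant tooling with an uploaded knowledge base, not this exact pipeline:

        # Generic retrieval-augmented generation: retrieve context, then prompt an LLM.
        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.metrics.pairwise import cosine_similarity

        knowledge_base = [
            "Neurofibromatosis type 1 is a genetic condition affecting the skin and nerves.",
            "Tuberous sclerosis can cause skin findings and benign growths in several organs.",
        ]

        def retrieve(question, docs, k=1):
            vec = TfidfVectorizer().fit(docs + [question])
            scores = cosine_similarity(vec.transform([question]), vec.transform(docs))[0]
            return [docs[i] for i in scores.argsort()[::-1][:k]]

        question = "What is neurofibromatosis?"
        context = "\n".join(retrieve(question, knowledge_base))
        prompt = f"Answer in plain language using only this context:\n{context}\n\nQ: {question}"
        # response = llm.generate(prompt)  # placeholder: any LLM client call goes here
        print(prompt)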
  8. J Surg Res. 2025 Jul 16. pii: S0022-4804(25)00353-1. [Epub ahead of print] 313: 222-229
       INTRODUCTION: This study aims to evaluate the accessibility of online health resources for nipple reconstruction in English and Spanish to identify areas for improving information access.
    METHODS: A deidentified Google search was conducted using the search phrase "nipple reconstruction" in English and "reconstrucción del pezón" in Spanish. The first ten websites in English and Spanish were included. A quality assessment of these websites was performed using the Patient Education Materials Assessment Tool, Cultural Sensitivity Assessment Tool, and Simple Measure of Gobbledygook to evaluate understandability and actionability, cultural sensitivity, and readability, respectively (the SMOG formula is given after this entry). Unpaired t-tests and Chi-square tests were used to analyze differences between the groups.
    RESULTS: English sites scored similarly to Spanish sites on understandability (70.1% versus 71.0%, P = 0.82) and actionability (46.3% versus 37.5%, P = 0.27), although actionability scores were below the acceptable threshold (70%) in both groups. English sites were significantly more culturally sensitive than Spanish sites (60% versus 10%, P < 0.001). English sites had a significantly higher average reading grade level than Spanish sites (12.3 versus 10.4, P = 0.005); however, both groups exceeded recommended reading grade levels for online health resources. For websites from the same organization, English websites tended to be harder to read, but more culturally sensitive, than Spanish ones.
    CONCLUSIONS: These findings suggest areas of improvement for culturally competent care for reconstruction patients. Improving the readability of online health resources for nipple reconstruction is essential in enabling patients to make informed decisions about their reconstructed breast.
    Keywords:  Actionability; Cultural sensitivity; English; Health literacy; Nipple reconstruction; Readability; Spanish; Understandability
    DOI:  https://doi.org/10.1016/j.jss.2025.06.019
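      The Simple Measure of Gobbledygook used in this entry estimates the years of education needed to understand a text; as general background, the standard formula is

        SMOG grade = 1.0430 * sqrt(polysyllabic words * (30 / sentences)) + 3.1291

      where polysyllabic words are those of three or more syllables. Health communication guidance commonly targets a 6th- to 8th-grade reading level, which both language groups here exceeded.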
  9. Urology. 2025 Jul 14. pii: S0090-4295(25)00687-9. [Epub ahead of print]
       OBJECTIVE: To assess the accuracy, quality, and readability of online health information for interstitial cystitis/bladder pain syndrome (IC/BPS).
    METHODS: Two search engines, Google and Bing, were queried using the search terms "interstitial cystitis treatment" and "bladder pain syndrome treatment." The first 20 websites from each search were recorded. Duplicate websites between searches were removed, and a predetermined set of inclusion and exclusion criteria was applied to screen websites. Two urologists, fellowship-trained in Urogynecology and Reconstructive Pelvic Surgery, assessed the accuracy of websites on a 1-5 Likert scale. The quality of websites was assessed using the DISCERN tool. The readability of websites was assessed using the Flesch-Kincaid Reading Ease (FKRE), Flesch-Kincaid Reading Level (FKRL), and SMOG indexes (the Flesch-Kincaid formulas are sketched after this entry).
    RESULTS: After screening, 25 individual websites were included for assessment. The accuracy of websites was high, with a median accuracy rating of 4 (accuracy of 75-99%). The quality of the websites was fair, with a median score of 42 (scale: 1-75). The readability of websites was poor, with a median FKRE of 45.8 (scale: 1-100), median FKRL of 10.6, indicating a 10th-grade reading level, and SMOG of 13, indicating a college reading level.
    DISCUSSION: Accuracy and quality of the top searched IC/BPS websites are adequate, but readability is poor. Further efforts should ensure that online health information is formatted at a reading level of 6th grade or below.
    DOI:  https://doi.org/10.1016/j.urology.2025.07.013
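      The two Flesch-Kincaid measures reported in this entry are simple functions of word, sentence, and syllable counts. A minimal sketch using the standard published coefficients; the counts are invented placeholders:

        # Standard Flesch-Kincaid formulas applied to made-up counts.
        words, sentences, syllables = 1200, 60, 2100

        fkre = 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)  # Reading Ease
        fkrl = 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59     # grade level

        print(f"FKRE = {fkre:.1f} (higher = easier to read)")
        print(f"FKRL = grade {fkrl:.1f}")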
  10. Ceska Gynekol. 2025;90(3): 194-203
       OBJECTIVE: This study aimed to assess the reliability and educational value of vaginal natural orifice transluminal endoscopic surgery (vNOTES) hysterectomy videos on YouTube and their suitability for training surgeons.
    MATERIALS AND METHODS: On June 12, 2024, YouTube was searched using the keywords "vNOTES hysterectomy," "TVNOTES hysterectomy," "transvaginal natural orifice transluminal endoscopic hysterectomy," "vNOTES," and "vaginal notes hysterectomy." A total of 73 videos met the inclusion criteria. Viewer engagement metrics, such as time since upload, number of views, likes, dislikes, comments, and video duration, were recorded. The view ratio, like ratio, and Video Power Index (VPI) were calculated (one common set of definitions is sketched after this entry). The videos were categorized by the modified Global Quality Scale (GQS) and evaluated based on a scoring system derived from a standardized 10-step vNOTES hysterectomy procedure, with scores ranging from 0 to 15.
    RESULTS: Out of 73 videos, 40 (54.8%) were categorized as poor quality, 13 (17.8%) as moderate, and 20 (27.4%) as good. No significant differences were found between groups in terms of time since upload, views, dislikes, comments, or like ratio. However, videos in the good-quality group had significantly more likes and higher VPI scores. Critical elements such as patient preparation and positioning, setup of the operating room, circumcision of the cervix, and vault closure were inadequately addressed in lower-quality videos. Videos with a didactic voice-over had significantly more views, likes, and comments than those with music or no sound. No significant correlations were found between video length and engagement metrics.
    CONCLUSION: The majority of vNOTES hysterectomy videos on YouTube (54.8%) lack comprehensive educational content, with only a small fraction deemed appropriate for surgical training. Viewer interest may not correlate with a video's usefulness. Surgeons and organizations should focus on producing high-quality, peer-reviewed instructional videos to improve the educational value of YouTube as a resource.
    Keywords:  Hysterectomy; YouTube; educational technology; natural orifice endoscopic surgery
    DOI:  https://doi.org/10.48095/cccg2025194
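      The engagement ratios named in this entry are conventional in YouTube quality studies, though exact definitions vary from paper to paper. One common set of definitions, assumed here since the abstract does not spell them out:

        view ratio = views / days since upload
        like ratio = likes / (likes + dislikes) * 100
        VPI        = like ratio * view ratio / 100

      Under these definitions, the VPI rewards videos that are both frequently watched and well liked, which is why it is usually read as a popularity index rather than a quality measure.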
  11. J Prosthet Dent. 2025 Jul 14. pii: S0022-3913(25)00552-9. [Epub ahead of print]
       STATEMENT OF PROBLEM: Patients seeking dental implants increasingly use YouTube to access information about such treatments. However, the educational quality and clinical reliability of such content remain inconsistent and largely unverified.
    PURPOSE: The purpose of this study was to evaluate the quality and usefulness of YouTube videos related to immediately placed and loaded implant-supported complete arch prostheses as a source of patient education.
    MATERIAL AND METHODS: An electronic search was conducted on YouTube for English-language videos using 4 predefined search terms and covering uploads between January 2019 and July 2024. Eligible videos were evaluated using the 5-point Global Quality Score (GQS) and a 10-point content score. ANOVA with Bonferroni post hoc tests, chi-squared tests, and Pearson correlation analysis were applied to evaluate differences and associations across video groups and metrics (α=.05).
    RESULTS: A total of 94 videos met the inclusion criteria. Most were rated as low quality, with 76.3% receiving a GQS score of 1 and 93.8% classified as poorly useful. The ANOVA revealed significant differences in GQS and content score among video categories, with higher scores observed in videos retrieved under "implant" and "prosthesis" terms compared with "loading" (P<.05). GQS was strongly correlated with content score (r=0.84; P<.001) but not with viewer engagement. Videos mostly focused on the surgical phase (96.9%).
    CONCLUSIONS: YouTube videos on immediately placed and loaded implant-supported complete arch prostheses were generally of low educational quality, often omitting critical patient-relevant information. Clinicians should guide patients toward accurate, evidence-based resources to support informed decision-making.
    DOI:  https://doi.org/10.1016/j.prosdent.2025.06.027
  12. J Allergy Clin Immunol Pract. 2025 Jul 15. pii: S2213-2198(25)00637-3. [Epub ahead of print]
       BACKGROUND: Currently, individuals can access unlimited information about health, disease, diagnosis, and treatment methods through video sharing sites. The YouTube platform is also widely used by both commercial and non-profit institutions and organizations to share health information.
    OBJECTIVE: The study was conducted to examine videos on the YouTube platform about pressure metered dose inhaler (pMDI) practices in terms of usefulness, reliability, and quality.
    METHODS: The descriptive study was conducted in January 2024 by analyzing 96 English-language YouTube™ videos found using the keyword "pressure metered dose inhaler." Two independent observers analyzed the videos based on usefulness, Quality Criteria for Consumer Health Information (DISCERN), Global Quality Score (GQS), duration, views, likes, comments, and time since publication (days).
    RESULTS: It was found that 46.9% of the videos were uploaded by health professionals/institutions, 70.8% were very useful, 80.2% had moderate reliability, and 55.2% had good quality content. Videos uploaded by universities/educational institutions/associations/health information websites were more useful (p=0.029), while videos uploaded by pharmaceutical companies had more views (p=0.037). Video duration showed significant correlations with usefulness (r=0.307; p=0.002), DISCERN (r=0.301; p=0.003), and GQS (r=0.349; p<0.001), while the number of views showed significant correlations with DISCERN (r=0.309; p=0.002) and GQS (r=0.347; p=0.001).
    CONCLUSION: The content of pMDI application videos on YouTube was found to be very useful, of moderate reliability, and of good quality. It can be concluded that YouTube is a suitable platform for videos on pressure metered dose inhaler practices and is advantageous for access to health information.
    Keywords:  Pressure metered dose inhaler; YouTube; social media
    DOI:  https://doi.org/10.1016/j.jaip.2025.07.005
  13. Dent Traumatol. 2025 Jul 16.
      The current study aims to evaluate the quality, accuracy, and reliability of dental trauma videos posted on YouTube. A search was performed on YouTube using the keywords "dental trauma," "tooth injury," "dental injury," and "traumatic dental injury." For each search term, the first 100 videos were recorded, and a total of 400 videos were examined. After applying exclusion criteria, 158 videos were analyzed by two experts. The quality, accuracy, and reliability of the videos were evaluated using the m-DISCERN, JAMA, and GQS scales, respectively. In this study, which did not require ethics committee approval, a statistically significant positive correlation was found between m-DISCERN scores and the number of likes, number of dislikes, number of views, video power index, view rate, and like rate (p = 0.029, p = 0.025, p = 0.007, p = 0.021, p = 0.021, p = 0.025, respectively). There was a statistically significant positive correlation between GQS scores and the number of likes, video length, and interaction index (p = 0.048, p < 0.001, p = 0.005, respectively). A statistically significant positive correlation was also found between JAMA scores and the number of dislikes, number of views, video power index, and view rate. The quality, accuracy, and reliability of YouTube videos on dental trauma are generally low, and healthcare professionals need to produce more video content on YouTube. Short videos receive more views and engagement during emergencies, indicating that viewers prefer easier-to-understand content. Viewers appear able to distinguish between high-quality and low-quality information.
    Keywords:  GQS; JAMA; YouTube; dental trauma; m-DISCERN
    DOI:  https://doi.org/10.1111/edt.13088
  14. Sci Rep. 2025 Jul 11. 15(1): 25042
      Hypertension is a major global health risk, and social media platforms like TikTok play an increasing role in public health communication. This study evaluates the quality of hypertension-related videos on TikTok and examines the relationship between content quality and user engagement. A systematic keyword search identified hypertension-related videos published before February 25, 2025. After applying exclusion criteria, 139 videos were analyzed. Video quality was assessed using the Journal of the American Medical Association (JAMA) benchmark criteria, the Global Quality Scale (GQS), and the modified DISCERN (mDISCERN) instrument. Engagement metrics (likes, comments, shares, and collections) were recorded, and Spearman's correlation and linear regression analyses explored associations between quality and engagement (a sketch of the correlation step follows this entry). The videos accumulated 7,190,634 likes, 3,608,023 collections, 247,602 comments, and 3,252,697 shares. Most (85.6%, n = 119) were uploaded by health professionals. The most common JAMA score was 2 (74.8%), while the mean GQS score was 3, a rating received by 36.7% of videos. The mDISCERN score was most frequently 2 (71.2%). Video duration was positively associated with JAMA (β = 0.235, p < 0.05), mDISCERN (β = 0.190, p < 0.05), and GQS scores (β = 0.410, p < 0.001). Likes were likewise associated with JAMA (β = 0.661, p < 0.001), mDISCERN (β = 0.500, p < 0.05), and GQS scores (β = 0.815, p < 0.001). Comments were negatively associated with GQS scores (β = -0.574, p < 0.05). TikTok is a valuable platform for hypertension education, but content quality is moderate. Extending video duration, improving rigor, and enhancing review mechanisms may improve hypertension-related health communication.
    Keywords:  Hypertension; Patient education; Social media; TikTok; Video quality
    DOI:  https://doi.org/10.1038/s41598-025-08680-1
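      The correlation step in this entry is straightforward to reproduce in outline. A minimal sketch with invented engagement data; the β values in the abstract come from the study's regression models, which this sketch does not replicate:

        # Spearman correlation between video duration and a quality score (toy data).
        from scipy.stats import spearmanr

        duration_s = [45, 120, 60, 300, 90, 180, 30, 240]
        gqs_score  = [2, 3, 2, 4, 3, 4, 1, 4]

        rho, p = spearmanr(duration_s, gqs_score)
        print(f"Spearman rho = {rho:.2f}, p = {p:.3f}")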
  15. Healthcare (Basel). 2025 Jun 23. 13(13): 1492
      Background/objectives: Social media has significantly enhanced access to medical knowledge by enabling rapid information sharing. With YouTube being the second-most popular website, we aimed to evaluate the quality of its content as a source of information about cerebral palsy for patients and relatives. Methods: The first 30 videos for the search terms "Cerebral palsy", "Spastic cerebral palsy", "Dyskinetic cerebral palsy", "Worster-Drought syndrome", and "Ataxic cerebral palsy" were selected for analysis. Out of 150 videos, a total of 83 were assessed with a mixed-methods approach by two independent raters using evidence-based quality scales: Quality Criteria for Consumer Health Information (DISCERN), the Journal of the American Medical Association instrument (JAMA), and the Global Quality Score (GQS). Furthermore, audience engagement was analyzed, and the Video Power Index (VPI) was calculated for each video. Results: The mean total DISCERN score excluding the final question (subjective assessment of the video) was 30.5 ± 8.7 (out of 75 points), implying that the quality of the videos was poor. The mean JAMA score across raters was 2.36 ± 0.57. The mean GQS score reached 2.57 ± 0.78. Videos had statistically higher DISCERN scores when they included treatment options, risk factors, anatomy, a definition, information for doctors, epidemiology, a doctor as speaker, or patient experience. Conclusions: YouTube appears to be a poor source of information on cerebral palsy for patients and relatives. This analysis can contribute to creating more engaging, holistic, and informative videos on the topic.
    Keywords:  DISCERN; YouTube; cerebral palsy; social media
    DOI:  https://doi.org/10.3390/healthcare13131492
  16. Sci Rep. 2025 Jul 11. 15(1): 25134
      Thyroid eye disease (TED) is an autoimmune condition that commonly impacts patients' visual function, appearance, and psychological well-being. Challenges in TED management include low early detection rates and large variation in treatment response. Video platforms like TikTok and Bilibili are increasingly utilized for health information dissemination, yet the quality of TED treatment content on these platforms varies significantly. This cross-sectional study collected TED treatment videos from TikTok and Bilibili in March 2025. After applying exclusion criteria, 152 videos (89 from TikTok, 63 from Bilibili) were analyzed. Quality, reliability, and educational value were assessed using the Global Quality Score (GQS), modified DISCERN (mDISCERN), Patient Education Materials Assessment Tool (PEMAT), and Journal of the American Medical Association (JAMA) benchmark criteria. Additionally, the study analyzed video content integrity, uploader identity, and correlations with user interaction data. TikTok videos scored higher in quality (GQS: 3.00 ± 0.58; mDISCERN: 3.17 ± 0.73) than Bilibili videos (GQS: 2.65 ± 0.65; mDISCERN: 2.21 ± 0.88; p < 0.001). On Bilibili, 62% of videos were uploaded by Traditional Chinese Medicine (TCM) physicians, yet these scored lower in quality. Professionally uploaded content, particularly by ophthalmologists, outperformed non-professional content (e.g., patient-uploaded videos), although patient-uploaded videos showed better interaction metrics. In terms of content, only 27% of videos addressed staged treatment, and 11.8% mentioned risk factor control. Correlation analysis revealed strong correlations among interaction metrics, but no correlation between interaction metrics and GQS, mDISCERN, or PEMAT scores. Short video platforms exhibit a dual role in the dissemination of TED treatment information: enhancing public awareness of the disease through professional content while risking misinformation due to inadequate auditing. Recommended interventions include robust platform certification, active involvement of medical organizations in content creation, and public education on prioritizing verified sources.
    Keywords:  Bilibili; Information quality; Patient education; Public health; Social media; Thyroid eye disease; TikTok
    DOI:  https://doi.org/10.1038/s41598-025-11147-y
  17. J Med Internet Res. 2025 Jul 14. 27: e64901
       Background: The recent increase in online health information-seeking has prompted extensive user appraisal of encountered content. Information consumption depends crucially on the quality of encountered information and the user's ability to evaluate it; yet, within the context of web-based, organic search behavior, few studies take into account both these aspects simultaneously.
    Objective: We aimed to explore a method to bridge these two aspects and grant even consideration to both the stimulus (web page content) and the user (ability to appraise encountered content). We examined novices and experts in information retrieval and appraisal to demonstrate a novel approach to studying information foraging theory: stimulus-engagement alignment (SEA).
    Methods: We sampled from experts and novices in information retrieval and assessment, asking participants to conduct a 10-minute search task with a specific information goal. We used an observational and a retrospective think-aloud protocol to collect data within the framework of an interview. Data from 3 streams (think-aloud, human-computer interaction, and screen content) were manually coded in the Reproducible Open Coding Kit standard and subsequently aligned and represented in a tabularized format with the R package {rock}. SEA scores were derived from designated code co-occurrences in specific segments of data within the stimulus data stream versus the think-aloud and human-computer interaction data streams.
    Results: SEA scores represented a meaningful comparison of what participants encountered and what they engaged with. Operationalizing codes as either "present" or "absent" in a particular data stream allowed us to inspect not only which credibility cues participants engaged with most frequently, but also whether participants noticed the absence of cues. Code co-occurrence frequencies could thus indicate case-, time-, and context-sensitive information appraisal that also takes into account the quality of information encountered.
    Conclusions: Using SEA allowed us to retain epistemic access to idiosyncratic manifestations of both stimuli and engagement. In addition, by using the same coding scheme and designated co-occurrences across participants, we were able to pinpoint trends within our sample and subsamples. We believe our approach offers a powerful analysis that gives the breadth and the depth of the data equal weight in understanding organic, web-based search behavior (a toy illustration of the alignment logic follows this entry).
    Keywords:  credibility; data visualization; digital health literacy; health information; information appraisal; information foraging; information retrieval; information-seeking; methodology; multimodal data
    DOI:  https://doi.org/10.2196/64901
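      The core SEA move in this entry, comparing codes present in the stimulus stream against codes engaged with in the think-aloud and interaction streams, can be illustrated with a toy presence/absence comparison. This is a generic sketch of the logic only; the segment data and the per-segment hit-fraction score are invented, and it is not the {rock} package's actual implementation:

        # Toy stimulus-engagement alignment: which credibility cues were on screen,
        # and which did the participant engage with? All codes/segments are invented.
        stimulus_codes   = [{"author", "date", "refs"}, {"ads"}, {"author", "refs"}]
        engagement_codes = [{"author"}, set(), {"refs"}]

        for i, (shown, engaged) in enumerate(zip(stimulus_codes, engagement_codes)):
            hits   = shown & engaged           # cues present and engaged with
            misses = shown - engaged           # cues present but ignored
            score  = len(hits) / len(shown) if shown else 0.0
            print(f"segment {i}: engaged={sorted(hits)}, ignored={sorted(misses)}, SEA={score:.2f}")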