bims-arihec Biomed News
on Artificial intelligence in healthcare
Issue of 2019–11–03
fifteen papers selected by
Céline Bélanger, Cogniges Inc.



  1. BMC Med. 2019 Oct 29. 17(1): 195
       BACKGROUND: Artificial intelligence (AI) research in healthcare is accelerating rapidly, with potential applications being demonstrated across various domains of medicine. However, there are currently limited examples of such techniques being successfully deployed into clinical practice. This article explores the main challenges and limitations of AI in healthcare, and considers the steps required to translate these potentially transformative technologies from research to clinical practice.
    MAIN BODY: Key challenges for the translation of AI systems in healthcare include those intrinsic to the science of machine learning, logistical difficulties in implementation, and consideration of the barriers to adoption as well as of the necessary sociocultural or pathway changes. Robust peer-reviewed clinical evaluation as part of randomised controlled trials should be viewed as the gold standard for evidence generation, but conducting these in practice may not always be appropriate or feasible. Performance metrics should aim to capture real clinical applicability and be understandable to intended users. Regulation that balances the pace of innovation with the potential for harm, alongside thoughtful post-market surveillance, is required to ensure that patients are not exposed to dangerous interventions nor deprived of access to beneficial innovations. Mechanisms to enable direct comparisons of AI systems must be developed, including the use of independent, local and representative test sets. Developers of AI algorithms must be vigilant to potential dangers, including dataset shift, accidental fitting of confounders, unintended discriminatory bias, the challenges of generalisation to new populations, and the unintended negative consequences of new algorithms on health outcomes.
    CONCLUSION: The safe and timely translation of AI research into clinically validated and appropriately regulated systems that can benefit everyone is challenging. Robust clinical evaluation, using metrics that are intuitive to clinicians and ideally go beyond measures of technical accuracy to include quality of care and patient outcomes, is essential. Further work is required (1) to identify themes of algorithmic bias and unfairness while developing mitigations to address these, (2) to reduce brittleness and improve generalisability, and (3) to develop methods for improved interpretability of machine learning predictions. If these goals can be achieved, the benefits for patients are likely to be transformational.
    Keywords:  Algorithms; Artificial intelligence; Evaluation; Machine learning; Regulation; Translation
    DOI:  https://doi.org/10.1186/s12916-019-1426-2
  2. JMIR Med Inform. 2019 Oct 31. 7(4): e15980
       BACKGROUND: Clinical trials are an important step in introducing new interventions into clinical practice by generating data on their safety and efficacy. Clinical trials need to ensure that participants are similar so that the findings can be attributed to the interventions studied and not to some other factors. Therefore, each clinical trial defines eligibility criteria, which describe characteristics that must be shared by the participants. Unfortunately, the complexities of eligibility criteria may not allow them to be translated directly into readily executable database queries. Instead, they may require careful analysis of the narrative sections of medical records. Manual screening of medical records is time consuming, thus negatively affecting the timeliness of the recruitment process.
    OBJECTIVE: Track 1 of the 2018 National Natural Language Processing Clinical Challenge focused on the task of cohort selection for clinical trials, aiming to answer the following question: Can natural language processing be applied to narrative medical records to identify patients who meet eligibility criteria for clinical trials? The task required the participating systems to analyze longitudinal patient records to determine if the corresponding patients met the given eligibility criteria. We aimed to describe a system developed to address this task.
    METHODS: Our system consisted of 13 classifiers, one for each eligibility criterion. All classifiers used a bag-of-words document representation model. To prevent the loss of relevant contextual information associated with such representation, a pattern-matching approach was used to extract context-sensitive features. They were embedded back into the text as lexically distinguishable tokens, which were consequently featured in the bag-of-words representation. Supervised machine learning was chosen wherever a sufficient number of both positive and negative instances was available to learn from. A rule-based approach focusing on a small set of relevant features was chosen for the remaining criteria.
    RESULTS: The system was evaluated using microaveraged F measure. Overall, 4 machine learning algorithms, including support vector machine, logistic regression, naïve Bayesian classifier, and gradient tree boosting (GTB), were evaluated on the training data using 10-fold cross-validation. GTB demonstrated the most consistent performance. Its performance peaked when oversampling was used to balance the training data. The final evaluation was performed on previously unseen test data. On average, the F measure of 89.04% was comparable to 3 of the top ranked performances in the shared task (91.11%, 90.28%, and 90.21%). With an F measure of 88.14%, we significantly outperformed these systems (81.03%, 78.50%, and 70.81%) in identifying patients with advanced coronary artery disease.
    CONCLUSIONS: The holdout evaluation provides evidence that our system was able to identify eligible patients for the given clinical trial with high accuracy. Our approach demonstrates how rule-based knowledge infusion can improve the performance of machine learning algorithms even when trained on a relatively small dataset.
    Keywords:  clinical trial; electronic medical records; eligibility determination; machine learning; natural language processing
    DOI:  https://doi.org/10.2196/15980
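The feature-embedding step described in the METHODS of this abstract can be illustrated with a minimal sketch. It assumes a single hypothetical pattern (an HbA1c reading) and illustrative thresholds; the pattern, the token names `HBA1C_IN_RANGE` and `HBA1C_OUT_OF_RANGE`, and the helper functions are not taken from the paper. A matched value is turned into a lexically distinct token that is written back into the text, so that it survives in a plain bag-of-words count.

```python
import re
from collections import Counter

# Hypothetical pattern: an HbA1c reading such as "HbA1c was 7.2%".
HBA1C = re.compile(r"\bhba1c\s*(?:of|is|was|:)?\s*(\d+(?:\.\d+)?)\s*%", re.IGNORECASE)


def embed_tokens(text: str) -> str:
    """Append a context-sensitive token after each matched HbA1c reading."""
    def repl(match: re.Match) -> str:
        value = float(match.group(1))
        # Illustrative thresholds only; a real criterion would define its own.
        token = "HBA1C_IN_RANGE" if 6.5 <= value <= 9.5 else "HBA1C_OUT_OF_RANGE"
        return f"{match.group(0)} {token}"
    return HBA1C.sub(repl, text)


def bag_of_words(text: str) -> Counter:
    """Lowercased bag-of-words over the text with embedded tokens."""
    return Counter(re.findall(r"[a-z0-9_]+", embed_tokens(text).lower()))


bag = bag_of_words("HbA1c was 7.2% at last visit.")
print(bag["hba1c_in_range"])
```

The resulting counts could then be fed to any classifier that accepts bag-of-words features; the numeric value itself would otherwise be split into uninformative fragments by the tokenizer.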
  3. J Gen Intern Med. 2019 Nov 01.
       BACKGROUND: Emergency departments (EDs) are becoming increasingly overwhelmed, which increases the risk of poor outcomes. Triage scores aim to optimize the waiting time and prioritize the resource usage. Artificial intelligence (AI) algorithms offer advantages for creating predictive clinical applications.
    OBJECTIVE: Evaluate a state-of-the-art machine learning model for predicting mortality at the triage level and, by validating this automatic tool, improve the categorization of patients in the ED.
    DESIGN: An institutional review board (IRB) approval was granted for this retrospective study. Information of consecutive adult patients (ages 18-100) admitted at the emergency department (ED) of one hospital were retrieved (January 1, 2012-December 31, 2018). Features included the following: demographics, admission date, arrival mode, referral code, chief complaint, previous ED visits, previous hospitalizations, comorbidities, home medications, vital signs, and Emergency Severity Index (ESI). The following outcomes were evaluated: early mortality (up to 2 days post ED registration) and short-term mortality (2-30 days post ED registration). A gradient boosting model was trained on data from years 2012-2017 and examined on data from the final year (2018). The area under the curve (AUC) for mortality prediction was used as an outcome metric. Single-variable analysis was conducted to develop a nine-point triage score for early mortality.
    KEY RESULTS: Overall, 799,522 ED visits were available for analysis. The early and short-term mortality rates were 0.6% and 2.5%, respectively. Models trained on the full set of features yielded an AUC of 0.962 for early mortality and 0.923 for short-term mortality. A model that utilized the nine features with the highest single-variable AUC scores (age, arrival mode, chief complaint, five primary vital signs, and ESI) yielded an AUC of 0.962 for early mortality.
    CONCLUSION: The gradient boosting model shows high predictive ability for screening patients at risk of early mortality utilizing data available at the time of triage in the ED.
    Keywords:  early mortality; emergency department; gradient boosting; machine learning; triage
    DOI:  https://doi.org/10.1007/s11606-019-05512-7
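The evaluation design in this abstract (train on earlier years, test on the final year, report AUC) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the record layout with a `year` key and both function names are assumptions, and the model itself is omitted. The AUC is computed from its rank interpretation, the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half.

```python
from typing import Dict, List, Sequence, Tuple


def split_by_year(visits: List[Dict], test_year: int) -> Tuple[List[Dict], List[Dict]]:
    """Temporal split: visits before test_year for training, test_year for testing."""
    train = [v for v in visits if v["year"] < test_year]
    test = [v for v in visits if v["year"] == test_year]
    return train, test


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data: three of the four positive/negative pairs are ranked correctly.
print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

The pairwise form is quadratic in the number of cases, so it only suits small examples; a sorted-rank implementation would be the usual choice at the scale of hundreds of thousands of visits.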
  4. J Med Internet Res. 2019 Oct 29. 21(10): e14658
       BACKGROUND: The ability of nursing undergraduates to communicate effectively with health care providers, patients, and their family members is crucial to their nursing professions as these can affect patient outcomes. However, the traditional use of didactic lectures for communication skills training is ineffective, and the use of standardized patients is not time- or cost-effective. Given the abilities of virtual patients (VPs) to simulate interactive and authentic clinical scenarios in secured environments with unlimited training attempts, a virtual counseling application is an ideal platform for nursing students to hone their communication skills before their clinical postings.
    OBJECTIVE: The aim of this study was to develop and test the use of VPs to better prepare nursing undergraduates for communicating with real-life patients, their family members, and other health care professionals during their clinical postings.
    METHODS: The stages of the creation of VPs included preparation, design, and development, followed by a testing phase before the official implementation. An initial voice chatbot was trained using a natural language processing engine, Google Cloud's Dialogflow, and was later visualized into a three-dimensional (3D) avatar form using Unity 3D.
    RESULTS: The VPs included four case scenarios that were congruent with the nursing undergraduates' semesters' learning objectives: (1) assessing the pain experienced by a pregnant woman, (2) taking the history of a depressed patient, (3) escalating a bleeding episode of a postoperative patient to a physician, and (4) showing empathy to a stressed-out fellow final-year nursing student. Challenges arose in terms of content development, technological limitations, and expectations management, which can be resolved by contingency planning, open communication, constant program updates, refinement, and training.
    CONCLUSIONS: The creation of VPs to assist in nursing students' communication skills training may provide authentic learning environments that enhance students' perceived self-efficacy and confidence in effective communication skills. However, given the infancy stage of this project, further refinement and constant enhancements are needed to train the VPs to simulate real-life conversations before the official implementation.
    Keywords:  artificial intelligence; communication; learning; nursing education; patients; technology; virtual reality
    DOI:  https://doi.org/10.2196/14658
  5. PLoS One. 2019; 14(10): e0223404
      Discovering technology opportunities from the opinion of users can promote successful technological development by satisfying the needs of users. However, although previous approaches using opinion mining have classified various needs of users into positive or negative categories, they cannot derive the main reasons for the users' opinions. To solve this problem, this research proposes an approach to exploring technology opportunity by structuring user needs with a concept of opinion trigger of objects and functions of the technology-based products. To discover technology opportunity, first, an opinion trigger is identified from review data using a Naïve Bayes classifier and natural language processing. Second, the opinion triggers and patent keywords that have a similar meaning in context are clustered to discover the needs of the user and need-related technology. Then, the sentimental values of needs are calculated through graph-based semi-supervised learning. Finally, the needs of the user are classified in resolving the problem of vacant technology to discover technology opportunity. Then, an R&D strategy of each opportunity is suggested based on opinion triggers, patent keywords, and their property. Based on the concept of opinion trigger-based methodology, a case study is conducted on automobile-related reviews, extracting the customer needs and presenting important R&D projects such as an extracted need (cargo transportation) and its R&D strategy (resolving contradiction). The proposed approach can analyze the needs of users at a functional level to discover new technology opportunities.
    DOI:  https://doi.org/10.1371/journal.pone.0223404
  6. J Med Life. 2019 Jul-Sep; 12(3): 203-214
      This is a narrative review of telemonitoring (remote monitoring) projects and studies within the field of diabetes, with a focus on results of the more recent studies. Since the beginning of the 1990s, several telemedicine projects and studies focused on type 1 and type 2 diabetes. Over the last 5 years, numerous telemedicine projects based on connected objects and new information and communication technologies (ICT) (elements defining telemedicine 2.0) have emerged or are still under development. Two examples are the DIABETe and TELESAGE telemonitoring projects, which fit perfectly within the telemedicine 2.0 framework - the first to include artificial intelligence (AI), with MyPredi™ and Diabeo™. Mainly, these projects and studies show that telemonitoring of diabetic patients results in: improvements in control of blood glucose (BG) level and significant reduction in HbA1c (e.g., for Telescot and TELESAGE studies); positive impact on co-morbidities (arterial hypertension, weight, dyslipidemia) (e.g., for Telescot and DIABETe studies); better quality of life for patients (e.g., for DIABETe study); positive impact on appropriation of the disease by patients and/or greater adherence to therapeutic and hygiene-dietary measures (e.g., The Utah Remote Monitoring Project); and, lastly, good receptiveness by patients and their empowerment. To date, the magnitude of its effects remains debatable, especially with the variation in patients' characteristics (e.g., background, ability for self-management, medical condition), sample selection and approach for the treatment of control groups. All of the recent studies have been classified as "Moderate" to "High".
    Keywords:  Internet; Web; artificial intelligence; chronic disease; diabetes; information and communication technology; telemedicine; telemonitoring
    DOI:  https://doi.org/10.25122/jml-2019-0006
  7. Eur J Nucl Med Mol Imaging. 2019 Oct 31.
      Artificial intelligence involves a wide range of smart techniques that are applicable to medical services including nuclear medicine. Recent advances in computing power and the availability of accumulated digital archives containing large amounts of patient images and records bring new opportunities for the implementation of artificial intelligence techniques in nuclear medicine. As a subset of artificial intelligence, machine learning is an emerging tool that can possibly perform many clinical tasks. The nuclear medicine community needs to adapt to this fast-approaching smart era, to exploit the opportunities and tackle the problems associated with artificial intelligence tools. This editorial aims to outline the opportunities and challenges of artificial intelligence applications in nuclear medicine.
    Keywords:  Artificial intelligence; Artificial neural networks; Deep learning; Machine learning; Radiomics; Supervised learning; Unsupervised learning
    DOI:  https://doi.org/10.1007/s00259-019-04593-0
  8. Hematol Oncol Clin North Am. 2019 Dec; pii: S0889-8588(19)30096-6. [Epub ahead of print] 33(6): 1095-1104
      The integration of artificial intelligence in the radiation oncologist's workflow has multiple applications and significant potential. From the initial patient encounter, artificial intelligence may aid in pretreatment disease outcome and toxicity prediction. It may subsequently aid in treatment planning, and enhanced dose optimization. Artificial intelligence may also optimize the quality assurance process and support a higher level of safety, quality, and efficiency of care. This article describes components of the radiation consultation, planning, and treatment process and how the thoughtful integration of artificial intelligence may improve shared decision making, planning efficiency, planning quality, patient safety, and patient outcomes.
    Keywords:  Artificial intelligence; Deep learning; Machine learning
    DOI:  https://doi.org/10.1016/j.hoc.2019.08.003
  9. AJR Am J Roentgenol. 2019 Oct 31. 1-7
      OBJECTIVE. The recent advancement of deep learning techniques has profoundly impacted research on quantitative cardiac MRI analysis. The purpose of this article is to introduce the concept of deep learning, review its current applications on quantitative cardiac MRI, and discuss its limitations and challenges. CONCLUSION. Deep learning has shown state-of-the-art performance on quantitative analysis of multiple cardiac MRI sequences and holds great promise for future use in clinical practice and scientific research.
    Keywords:  artificial intelligence; cardiac MRI; deep learning; quantitative MRI
    DOI:  https://doi.org/10.2214/AJR.19.21927
  10. JMIR Form Res. 2019 Oct 29. 3(4): e13863
       BACKGROUND: Health apps for the screening and diagnosis of mental disorders have emerged in recent years on various levels (eg, patients, practitioners, and public health system). However, the diagnostic quality of these apps has not been (sufficiently) tested so far.
    OBJECTIVE: The objective of this pilot study was to investigate the diagnostic quality of a health app for a broad spectrum of mental disorders and its dependency on expert knowledge.
    METHODS: Two psychotherapists, two psychology students, and two laypersons each read 20 case vignettes with a broad spectrum of mental disorders. They used a health app (Ada-Your Health Guide) to get a diagnosis by entering the symptoms. Interrater reliabilities were computed between the diagnoses of the case vignettes and the results of the app for each user group.
    RESULTS: Overall, there was a moderate diagnostic agreement (kappa=0.64) between the results of the app and the case vignettes for mental disorders in adulthood and a low diagnostic agreement (kappa=0.40) for mental disorders in childhood and adolescence. When psychotherapists applied the app, there was a good diagnostic agreement (kappa=0.78) regarding mental disorders in adulthood. The diagnostic agreement was moderate (kappa=0.55/0.60) for students and laypersons. For mental disorders in childhood and adolescence, a moderate diagnostic quality was found when psychotherapists (kappa=0.53) and students (kappa=0.41) used the app, whereas the quality was low for laypersons (kappa=0.29). On average, the app required 34 questions to be answered and 7 min to complete.
    CONCLUSIONS: The health app investigated here can represent an efficient diagnostic screening or help function for mental disorders in adulthood and has the potential to support especially diagnosticians in their work in various ways. The results of this pilot study provide a first indication that the diagnostic accuracy is user dependent and improvements in the app are needed especially for mental disorders in childhood and adolescence.
    Keywords:  (mobile) app; artificial intelligence; diagnostic; eHealth; mHealth; mental disorders; screening
    DOI:  https://doi.org/10.2196/13863
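The kappa values reported in this abstract measure agreement beyond chance. As a minimal sketch, assuming the two-rater Cohen's kappa (the abstract does not state which variant was used) and made-up diagnosis labels, the statistic is (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance from each rater's label frequencies.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items on which both raters give the same label.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the two raters' marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum((freq_a[c] / n) * (freq_b[c] / n) for c in set(freq_a) | set(freq_b))
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_o - p_e) / (1.0 - p_e)


# Illustrative labels only: vignette diagnoses versus app results.
vignette = ["depression", "depression", "anxiety", "anxiety"]
app = ["depression", "anxiety", "anxiety", "anxiety"]
print(cohens_kappa(vignette, app))
```

In this toy case the raters agree on three of four items (p_o = 0.75), but half of that agreement is expected by chance (p_e = 0.5), which is why kappa is lower than raw agreement.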
  11. Psychiatr Clin North Am. 2019 Dec; pii: S0193-953X(19)30077-2. [Epub ahead of print] 42(4): 627-634
      The goal of automating complex human activities dates to antiquity. The mental health field has also made use of advances in technology to assist patients in need. Artificial Intelligence (AI) is the study of agents that receive percepts from the environment and perform actions. AI is increasingly being incorporated into the development of chatbots that can be deployed in both clinical and nonclinical settings. A chatbot is a computer program that simulates human conversation through voice commands or text chats or both. The collaboration between AI therapists and more traditional providers of such care will only grow.
    Keywords:  AI; Artificial intelligence; Bot; Chatbot; Mental health
    DOI:  https://doi.org/10.1016/j.psc.2019.08.007
  12. Psychiatr Clin North Am. 2019 Dec; pii: S0193-953X(19)30070-X. [Epub ahead of print] 42(4): 597-609
      Self-help and automated technologies can be useful for behavioral and mental health education and interventions. These technologies include interactive media, online courses, artificial intelligence-powered chatbots, voice assistants, and video games. Self-help media can include books, videos, audible media like podcasts, blog and print articles, and self-contained Internet sites. Social media, online courses, and mass-market mobile apps also can include such media. These technologies serve to decrease geospatial, temporal, and financial barriers. This article describes different self-help and automated technologies, how to implement such technologies in existing clinical services, and how to implement according to patient needs.
    Keywords:  Chatbots; Education; Media; Mental health; Smartphone; Video games; Voice assistants; Websites
    DOI:  https://doi.org/10.1016/j.psc.2019.07.001
  13. JCO Precis Oncol. 2019; 3
       PURPOSE: We developed an unbiased framework to study the association of several mutations in predicting resistance to hypomethylating agents (HMAs) in patients with myelodysplastic syndromes (MDS), analogous to consumer and commercial recommender systems in which customers who bought products A and B are likely to buy C: patients who have a mutation in gene A and gene B are likely to respond or not respond to HMAs.
    METHODS: We screened a cohort of 433 patients with MDS who received HMAs for the presence of common myeloid mutations in 29 genes that were obtained before the patients started therapy. The association between mutations and response was evaluated by the Apriori market basket analysis algorithm. Rules with the highest confidence (confidence that the association exists) and the highest lift (strength of the association) were chosen. We validated our biomarkers in samples from patients enrolled in the S1117 trial.
    RESULTS: Among 433 patients, 193 (45%) received azacitidine, 176 (40%) received decitabine, and 64 (15%) received HMA alone or in combination. The median age was 70 years (range, 31 to 100 years), and 28% were female. The median number of mutations per sample was three (range, zero to nine), and 176 patients (41%) had three or more mutations per sample. Association rules identified several genomic combinations as being highly associated with no response. These molecular signatures were present in 30% of patients with three or more mutations/sample with an accuracy rate of 87% in the training cohort and 93% in the validation cohort.
    CONCLUSION: Genomic biomarkers can identify, with high accuracy, approximately one third of patients with MDS who will not respond to HMAs. This study highlights the importance of machine learning technologies such as the recommender system algorithm in translating genomic data into useful clinical tools.
    DOI:  https://doi.org/10.1200/po.19.00119
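The rule metrics named in this abstract can be made concrete with the standard association-rule definitions. The sketch below scores one candidate rule of the form "mutations in these genes imply no response"; it does not implement the Apriori candidate search itself, and the gene names, patient records, and function name are hypothetical. Support is the share of all patients matching both sides of the rule, confidence is the share of patients with the antecedent who also show the consequent, and lift is confidence divided by the overall rate of the consequent.

```python
from typing import FrozenSet, List, Set, Tuple

Patient = Tuple[Set[str], bool]  # (mutated genes, responded to therapy)


def rule_metrics(patients: List[Patient], antecedent: FrozenSet[str]) -> Tuple[float, float, float]:
    """Support, confidence and lift for the rule: antecedent -> no response."""
    n = len(patients)
    # Response flags of the patients who carry every gene in the antecedent.
    matched = [responded for genes, responded in patients if antecedent <= genes]
    n_antecedent = len(matched)
    n_both = sum(1 for responded in matched if not responded)
    n_consequent = sum(1 for _, responded in patients if not responded)
    if n == 0 or n_antecedent == 0 or n_consequent == 0:
        raise ValueError("rule cannot be scored on this cohort")
    support = n_both / n
    confidence = n_both / n_antecedent
    lift = confidence / (n_consequent / n)
    return support, confidence, lift


# Made-up cohort of five patients; GENE_A and GENE_B are placeholders.
cohort: List[Patient] = [
    ({"GENE_A", "GENE_B"}, False),
    ({"GENE_A", "GENE_B"}, False),
    ({"GENE_A"}, True),
    ({"GENE_B"}, True),
    (set(), False),
]
print(rule_metrics(cohort, frozenset({"GENE_A", "GENE_B"})))
```

A lift above 1 means that non-response is more common among patients carrying the antecedent mutations than in the cohort as a whole.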
  14. J Infect Public Health. 2019 Oct 24. pii: S1876-0341(19)30305-3. [Epub ahead of print]
      Glaucoma is a major cause of blindness. Most patients notice the disease only late, after it has caused a high level of damage to the optic nerve head and a high percentage of vision loss. Early diagnosis and treatment are therefore essential. Image processing mass-screening and machine learning classification can support early and automatic diagnosis of Glaucoma symptoms so as to take protective measures and to extend symptom-free life of patients. This paper proposes improved techniques to extract disease-related and image-based features. Support Vector Machines and Genetically-Optimized Artificial Neural Networks, prominent machine learning algorithms, are fine-tuned to combine the two sets of features in one automated image classification system. The proposed methodology was applied to a dataset of 106 retina images obtained from three hospitals. The proposed system automatically detected Glaucoma using the Support Vector Machines technique with 100% specificity and 87% accuracy. The Artificial Neural Network classified the images with 98% accuracy.
    Keywords:  Artificial Neural Networks; Diagnosis; Genetic Algorithms; Glaucoma; Support Vector Machines
    DOI:  https://doi.org/10.1016/j.jiph.2019.09.005