bims-arihec Biomed News
on Artificial intelligence in healthcare
Issue of 2020-02-09
twenty-one papers selected by
Céline Bélanger, Cogniges Inc.



  1. Eur Radiol Exp. 2020 Feb 07. 4(1): 11
      Radiomics, artificial intelligence, and deep learning figure amongst recent buzzwords in medical imaging research and technological development. Analysis of medical big data in the assessment and follow-up of personalised treatments has also become a major research topic in the area of precision medicine. In this review, current research trends in radiomics are analysed, from handcrafted radiomics feature extraction and statistical analysis to deep learning. Radiomics algorithms now include genomics and immunomics data to improve patient stratification and prediction of treatment response. Several applications have already shown conclusive results demonstrating the potential of adding other "omics" data to existing imaging features. We also discuss further challenges of data harmonisation and management infrastructure to shed light on the much-needed integration of radiomics and all other "omics" into clinical workflows. In particular, we point to the emerging paradigm shift in the implementation of big data infrastructures to facilitate databank growth, data extraction and the development of expert software tools. Secured access, sharing, and integration of all health data, called "holomics", will accelerate the revolution of personalised medicine and oncology as well as expand the role of imaging specialists.
    Keywords:  Artificial intelligence; Holomics; Machine learning; Precision medicine; Radiomics
    DOI:  https://doi.org/10.1186/s41747-019-0143-0
  2. Acad Radiol. 2020 Jan 31. pii: S1076-6332(20)30003-9. [Epub ahead of print]
       RATIONALE AND OBJECTIVES: This study aimed to investigate whether benign and malignant renal solid masses could be distinguished through machine learning (ML)-based computed tomography (CT) texture analysis.
    MATERIALS AND METHODS: Seventy-nine patients with 84 solid renal masses (21 benign; 63 malignant) from a single center were included in this retrospective study. Malignant masses included common renal cell carcinoma (RCC) subtypes: clear cell RCC, papillary cell RCC, and chromophobe RCC. Benign masses were represented by oncocytomas and fat-poor angiomyolipomas. Following preprocessing steps, a total of 271 texture features were extracted from unenhanced and contrast-enhanced CT images. Dimension reduction was performed first with a reliability analysis and then with a feature selection algorithm. A nested approach was used for feature selection, model optimization, and validation. Eight ML algorithms were used for the classifications: decision tree, locally weighted learning, k-nearest neighbors, naive Bayes, logistic regression, support vector machine, neural network, and random forest.
    RESULTS: The number of features with good reproducibility was 198 for unenhanced CT and 244 for contrast-enhanced CT. The random forest algorithm demonstrated the best predictive performance using five selected contrast-enhanced CT texture features. The accuracy and area under the curve metrics were 90.5% and 0.915, respectively. After the highly collinear features were eliminated from the analysis, the accuracy and area under the curve values increased slightly to 91.7% and 0.916, respectively.
    CONCLUSION: ML-based contrast-enhanced CT texture analysis might be a potential method for distinguishing benign and malignant solid renal masses with satisfactory performance.
    Keywords:  Artificial intelligence; Machine learning; Radiomics; Renal mass; Texture analysis
    DOI:  https://doi.org/10.1016/j.acra.2019.12.015
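The accuracy and AUC metrics reported in abstracts like the one above are standard classifier summaries. As a minimal, library-free sketch of how they are computed from predicted scores (illustrative only; the toy labels and scores below are invented, not study data):

```python
# Minimal sketch: accuracy and ROC AUC from predicted scores.
# Illustrative only -- not the study's actual pipeline or data.

def accuracy(y_true, y_pred):
    """Fraction of correct hard-label predictions."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def roc_auc(y_true, scores):
    """AUC via the Mann-Whitney U statistic: the probability that a
    random positive outscores a random negative, ties counting half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example: 1 = malignant, 0 = benign
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]
print(accuracy(y_true, [int(s >= 0.5) for s in scores]))  # 0.6666666666666666
print(roc_auc(y_true, scores))                            # 0.8888888888888888
```

Note that AUC is threshold-free: it summarizes ranking quality over all cut-offs, which is why it is reported alongside accuracy.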
  3. Ther Innov Regul Sci. 2020 Jan;54(1): 69-74
       BACKGROUND: Delays in clinical trial enrollment and difficulties enrolling representative samples continue to vex sponsors, sites, and patient populations. Here we investigated use of an artificial intelligence-powered technology, Mendel.ai, as a means of overcoming bottlenecks and potential biases associated with standard patient prescreening processes in an oncology setting.
    METHODS: Mendel.ai was applied retroactively to 2 completed oncology studies (1 breast, 1 lung), and 1 study that failed to enroll (lung), at the Comprehensive Blood and Cancer Center, allowing direct comparison between results achieved using standard prescreening practices and results achieved with Mendel.ai. Outcome variables included the number of patients identified as potentially eligible and the elapsed time between eligibility and identification.
    RESULTS: For each trial that enrolled, use of Mendel.ai resulted in a 24% to 50% increase over standard practices in the number of patients correctly identified as potentially eligible. No patients correctly identified by standard practices were missed by Mendel.ai. For the nonenrolling trial, both approaches failed to identify suitable patients. An average of 19 days for breast and 263 days for lung cancer patients elapsed between actual patient eligibility (based on clinical chart information) and identification when the standard prescreening practice was used. In contrast, ascertainment of potential eligibility using Mendel.ai took minutes.
    CONCLUSIONS: This study suggests that augmentation of human resources with artificial intelligence could yield sizable improvements over standard practices in several aspects of the patient prescreening process, as well as in approaches to feasibility, site selection, and trial selection.
    Keywords:  artificial intelligence; clinical trial enrollment; clinical trial screening; clinical trial startup; feasibility; machine learning; real world data
    DOI:  https://doi.org/10.1007/s43441-019-00030-4
  4. Eur J Cardiothorac Surg. 2020 Feb 03. pii: ezaa011. [Epub ahead of print]
       OBJECTIVES: Because evidence has shown that tumour spread through air spaces (STAS) is an oncological contraindication to sublobar resection, preoperative recognition of STAS is vital in customizing surgical strategies. We aimed to assess the value of radiomics in predicting STAS in stage I lung adenocarcinoma.
    METHODS: We retrospectively reviewed patients with stage I lung adenocarcinoma who underwent curative resection in our institution between January 2011 and December 2013. Using the 'PyRadiomics' package, 88 radiomics features were extracted from computed tomography (CT) images, and a prediction model was then constructed using a Naïve Bayes machine-learning approach. The accuracy of the model was assessed through receiver operating characteristic curve analysis, and the performance of the model was validated both internally and externally.
    RESULTS: A total of 233 patients were included as the training cohort, with 69 (29.6%) patients being STAS (+). Patients with STAS had worse recurrence-free survival and overall survival (P < 0.001). After feature extraction, the 5 most contributory radiomics features were selected to develop a Naïve Bayes model. In the internal validation, the model exhibited good performance with an area under the curve value of 0.63 (0.55-0.71). External validation was conducted on a test cohort of 112 patients and produced an area under the curve value of 0.69.
    CONCLUSIONS: CT-based radiomics is valuable in preoperatively predicting STAS in stage I lung adenocarcinoma, which may aid surgeons in determining the optimal surgical approach.
    Keywords:  Lung cancer; Radiomics; Spread through air spaces; Surgery
    DOI:  https://doi.org/10.1093/ejcts/ezaa011
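A Naïve Bayes model of the kind used in the study above assumes the features are conditionally independent and scores each class by its posterior probability. A minimal from-scratch Gaussian naive Bayes sketch; the toy "radiomics" feature values are invented for illustration, not study data:

```python
import math

# Minimal Gaussian naive Bayes sketch (illustrative only; the toy
# feature values below are invented, not the study's radiomics data).

def fit(X, y):
    """Estimate per-class feature means/variances and class priors."""
    model = {}
    for c in set(y):
        rows = [x for x, t in zip(X, y) if t == c]
        means = [sum(col) / len(col) for col in zip(*rows)]
        vars_ = [max(sum((v - m) ** 2 for v in col) / len(col), 1e-9)
                 for col, m in zip(zip(*rows), means)]
        model[c] = (means, vars_, len(rows) / len(y))
    return model

def predict(model, x):
    """Return the class with the highest posterior log-probability."""
    best, best_lp = None, -math.inf
    for c, (means, vars_, prior) in model.items():
        lp = math.log(prior)
        for v, m, s2 in zip(x, means, vars_):
            lp += -0.5 * math.log(2 * math.pi * s2) - (v - m) ** 2 / (2 * s2)
        if lp > best_lp:
            best, best_lp = c, lp
    return best

# Toy training set: two invented texture features per tumour
X = [[1.0, 0.2], [1.2, 0.1], [3.0, 0.9], [3.2, 1.1]]
y = [0, 0, 1, 1]  # 0 = STAS(-), 1 = STAS(+)
m = fit(X, y)
print(predict(m, [3.1, 1.0]))  # 1
```

The independence assumption rarely holds exactly for correlated radiomics features, which is one reason feature selection precedes model fitting.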
  5. J Natl Cancer Inst. 2020 Feb 04. pii: djaa017. [Epub ahead of print]
       BACKGROUND: To forecast survival and enhance treatment decisions for patients with colorectal cancer liver metastases (mCRC) by using an on-treatment radiomics signature to predict tumor sensitivity to FOLFIRI±cetuximab.
    METHODS: We retrospectively analyzed 667 mCRC patients treated with FOLFIRI alone [F] or in combination with cetuximab [FC]. CT quality was classified as high (HQ) or standard (SD). Four datasets were created using the nomenclature [treatment]-[quality]. Patients were randomly assigned (2:1) to training or validation sets: FCHQ: 78:38, FCSD: 124:62, FHQ: 78:51, FSD: 158:78. Four tumor imaging biomarkers measured quantitative radiomics changes between standard-of-care CT scans at baseline and 8 weeks. Using machine learning, the signature was trained to classify tumors as treatment-sensitive or treatment-insensitive, and its performance was validated using ROC curves. Hazard ratios (HR) and Cox regression models were used to evaluate the association with overall survival (OS).
    RESULTS: The signature (AUC[95CI]) used temporal decrease in tumor spatial heterogeneity plus boundary infiltration to successfully predict sensitivity to anti-EGFR therapy (FCHQ: 0.80 [0.69-0.94], FCSD: 0.72 [0.59-0.83]) but failed with chemotherapy (FHQ: 0.59 [0.44-0.72], FSD: 0.55 [0.43-0.66]). In cetuximab-containing sets, radiomics signature outperformed existing biomarkers (KRAS-mutational status, and tumor shrinkage by RECIST 1.1) for detection of treatment-sensitivity and was strongly associated with OS (two-sided P < 0.005).
    CONCLUSIONS: Radiomics response signature can serve as an intermediate surrogate marker of overall survival. The signature outperformed known biomarkers in providing an early prediction of treatment-sensitivity and could be used to guide cetuximab treatment continuation decisions.
    Keywords:  Artificial intelligence; Cetuximab; Colorectal Cancer; FOLFIRI; RAS; deep-learning; machine-learning; radiomics
    DOI:  https://doi.org/10.1093/jnci/djaa017
  6. Neuroimage Clin. 2020 Jan 23. pii: S2213-1582(20)30011-5. [Epub ahead of print] 25: 102172
      The imaging and subsequent accurate diagnosis of paediatric brain tumours present a radiological challenge, with magnetic resonance imaging playing a key role in providing tumour-specific imaging information. Diffusion-weighted and perfusion imaging are commonly used to aid the non-invasive diagnosis of children's brain tumours, but are usually evaluated by expert qualitative review. Quantitative studies are mainly single centre and single modality. The aim of this work was to combine multi-centre diffusion and perfusion imaging with machine learning to develop classifiers that discriminate between three common paediatric tumour types. The results show that diffusion- and perfusion-weighted imaging of both the tumour and whole brain provide significant features which differ between tumour types, and that combining these features gives the optimal machine learning classifier with >80% predictive precision. This work represents a step forward in aiding the non-invasive diagnosis of paediatric brain tumours using advanced clinical imaging.
    Keywords:  Diffusion; Machine learning; Perfusion
    DOI:  https://doi.org/10.1016/j.nicl.2020.102172
  7. Cancer Cytopathol. 2020 Feb 03.
       BACKGROUND: The Bethesda System for Reporting Thyroid Cytopathology (TBSRTC) comprises 6 categories used for the diagnosis of thyroid fine-needle aspiration biopsy (FNAB). Each category has an associated risk of malignancy, which is important in the management of a thyroid nodule. More accurate predictions of malignancy may help to reduce unnecessary surgery. A machine learning algorithm (MLA) was developed to evaluate thyroid FNAB via whole slide images (WSIs) to predict malignancy.
    METHODS: Files were searched for all thyroidectomy specimens with preceding FNAB over 8 years. All cytologic and surgical pathology diagnoses were recorded and correlated for each nodule. One representative slide from each case was scanned to create a WSI. An MLA was designed to identify follicular cells and predict the malignancy of the final pathology. The test set comprised cases blindly reviewed by a cytopathologist who assigned a TBSRTC category. The area under the receiver operating characteristic curve was used to assess the MLA performance.
    RESULTS: Nine hundred eight FNABs met the criteria. The MLA predicted malignancy with a sensitivity and specificity of 92.0% and 90.5%, respectively. The areas under the curve for the prediction of malignancy by the cytopathologist and the MLA were 0.931 and 0.932, respectively.
    CONCLUSIONS: The performance of the MLA in predicting thyroid malignancy from FNAB WSIs is comparable to the performance of an expert cytopathologist. When the MLA and electronic medical record diagnoses are combined, the performance is superior to the performance of either alone. An MLA may be used as an adjunct to FNAB to assist in refining the indeterminate categories.
    Keywords:  Bethesda System for Reporting Thyroid Cytopathology; machine learning; malignancy prediction; neural network; thyroid fine-needle aspiration (FNA)
    DOI:  https://doi.org/10.1002/cncy.22238
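The sensitivity and specificity reported for the thyroid FNAB model are ratios from a confusion matrix. A small sketch; the counts below are invented (chosen to reproduce the abstract's 92.0% and 90.5%), not the study's actual case counts:

```python
# Sensitivity and specificity from confusion-matrix counts.
# Counts are invented for illustration, not the study's data.

def sens_spec(tp, fn, tn, fp):
    """Sensitivity = TP/(TP+FN); specificity = TN/(TN+FP)."""
    return tp / (tp + fn), tn / (tn + fp)

sens, spec = sens_spec(tp=92, fn=8, tn=181, fp=19)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
# sensitivity=0.920, specificity=0.905
```

Sensitivity counts malignant nodules the model catches; specificity counts benign nodules it correctly leaves alone. The balance between the two is what the ROC curve traces out.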
  8. JAMIA Open. 2019 Dec;2(4): 528-537
       Objectives: Most population-based cancer databases lack information on metastatic recurrence. Electronic medical records (EMR) and cancer registries contain complementary information on cancer diagnosis, treatment and outcome, yet are rarely used synergistically. To construct a cohort of metastatic breast cancer (MBC) patients, we applied natural language processing techniques within a semisupervised machine learning framework to linked EMR-California Cancer Registry (CCR) data.
    Materials and Methods: We studied all female patients treated at Stanford Health Care with an incident breast cancer diagnosis from 2000 to 2014. Our database consisted of structured fields and unstructured free-text clinical notes from EMR, linked to CCR, a component of the Surveillance, Epidemiology and End Results Program (SEER). We identified de novo MBC patients from CCR and extracted information on distant recurrences from patient notes in EMR. Furthermore, we trained a regularized logistic regression model for recurrent MBC classification and evaluated its performance on a gold standard set of 146 patients.
    Results: There were 11 459 breast cancer patients in total and the median follow-up time was 96.3 months. We identified 1886 MBC patients, 512 (27.1%) of whom were de novo MBC patients and 1374 (72.9%) were recurrent MBC patients. Our final MBC classifier achieved an area under the receiver operating characteristic curve (AUC) of 0.917, with sensitivity 0.861, specificity 0.878, and accuracy 0.870.
    Discussion and Conclusion: To enable population-based research on MBC, we developed a framework for retrospective case detection combining EMR and CCR data. Our classifier achieved good AUC, sensitivity, and specificity without expert-labeled examples.
    Keywords:  SEER; cancer distant recurrence; electronic medical records; natural language processing; semi-supervised machine learning
    DOI:  https://doi.org/10.1093/jamiaopen/ooz040
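The regularized logistic regression family used for the MBC classifier above can be sketched in a few lines. This is an illustrative from-scratch L2-regularized version with invented toy data (a single made-up note-derived feature), not the authors' model:

```python
import math

# Sketch of an L2-regularised logistic regression classifier,
# the model family used for the MBC classifier (illustrative only;
# the toy feature and labels are invented).

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(X, y, lam=0.1, lr=0.5, epochs=500):
    """Batch gradient descent on L2-regularised log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [lam * wi for wi in w], 0.0  # ridge penalty gradient
        for x, t in zip(X, y):
            err = sigmoid(b + sum(wi * xi for wi, xi in zip(w, x))) - t
            gw = [g + err * xi / n for g, xi in zip(gw, x)]
            gb += err / n
        w = [wi - lr * g for wi, g in zip(w, gw)]
        b -= lr * gb
    return w, b

def predict_proba(w, b, x):
    return sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))

# One invented feature, e.g. fraction of notes mentioning recurrence
X = [[0.1], [0.2], [0.8], [0.9]]
y = [0, 0, 1, 1]
w, b = train(X, y)
print(predict_proba(w, b, [0.85]) > 0.5)  # True
```

The L2 penalty shrinks coefficients toward zero, which helps when the feature set (here, NLP-derived signals from clinical notes) is large relative to the labeled examples.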
  9. Eur J Prev Cardiol. 2020 Feb 04. 2047487319898951
       AIMS: Familial hypercholesterolemia (FH) is the most common genetic disorder of lipid metabolism. The gold standard for FH diagnosis is genetic testing, available, however, only in selected university hospitals. Clinical scores - for example, the Dutch Lipid Score - are often employed as alternative, more accessible, albeit less accurate FH diagnostic tools. The aim of this study is to obtain a more reliable approach to FH diagnosis by a "virtual" genetic test using machine-learning approaches.
    METHODS AND RESULTS: We used three machine-learning algorithms (a classification tree (CT), a gradient boosting machine (GBM), a neural network (NN)) to predict the presence of FH-causative genetic mutations in two independent FH cohorts: the FH Gothenburg cohort (split into training data (N = 174) and internal test (N = 74)) and the FH-CEGP Milan cohort (external test, N = 364). By evaluating their area under the receiver operating characteristic (AUROC) curves, we found that the three machine-learning algorithms performed better (AUROC 0.79 (CT), 0.83 (GBM), and 0.83 (NN) on the Gothenburg cohort, and 0.70 (CT), 0.78 (GBM), and 0.76 (NN) on the Milan cohort) than the clinical Dutch Lipid Score (AUROC 0.68 and 0.64 on the Gothenburg and Milan cohorts, respectively) in predicting carriers of FH-causative mutations.
    CONCLUSION: In the diagnosis of FH-causative genetic mutations, all three machine-learning approaches we have tested outperform the Dutch Lipid Score, which is the clinical standard. We expect these machine-learning algorithms to provide the tools to implement a virtual genetic test of FH. These tools might prove particularly important for lipid clinics without access to genetic testing.
    Keywords:  Familial hypercholesterolemia; cardiovascular disease; dyslipidemia; machine learning; prediction model
    DOI:  https://doi.org/10.1177/2047487319898951
  10. Diabetes Ther. 2020 Feb 03.
       INTRODUCTION: To identify predictors of hypoglycemia and five other clinical and economic outcomes among treated patients with type 2 diabetes (T2D) using machine learning and structured data from a large, geographically diverse administrative claims database.
    METHODS: A retrospective cohort study design was applied to Optum Clinformatics claims data indexed on first antidiabetic prescription date. A hypothesis-free, Bayesian machine learning analytics platform (GNS Healthcare REFS™: Reverse Engineering and Forward Simulation) was used to build ensembles of generalized linear models to predict six outcomes defined in patients' 1-year post-index claims history, including hypoglycemia, antidiabetic class persistence, glycated hemoglobin (HbA1c) target attainment, HbA1c change, T2D-related inpatient admissions, and T2D-related medical costs. A unified set of 388 variables defined in patients' 1-year pre-index claims history constituted the set of predictors for all REFS models.
    RESULTS: The derivation cohort comprised 453,487 patients with a T2D diagnosis between 2014 and 2017. Patients with comorbid conditions had the highest risk of hypoglycemia, including those with prior hypoglycemia (odds ratio [OR] = 25.61) and anemia (OR = 1.29). Other identified risk factors included insulin (OR = 2.84) and sulfonylurea use (OR = 1.80). Biguanide use (OR = 0.75), high blood glucose (> 125 mg/dL vs. < 100 mg/dL, OR = 0.47; 100-125 mg/dL vs. < 100 mg/dL, OR = 0.53), and missing blood glucose test (OR = 0.40) were associated with reduced risk of hypoglycemia. Area under the curve (AUC) of the hypoglycemia model in held-out testing data was 0.77. Patients in the top 15% of predicted hypoglycemia risk constituted 50% of observed hypoglycemic events, 26% of T2D-related inpatient admissions, and 24% of all T2D-related medical costs.
    CONCLUSIONS: Machine learning models built within high-dimensional, real-world data can predict patients at risk of clinical outcomes with a high degree of accuracy, while uncovering important factors associated with outcomes that can guide clinical practice. Targeted interventions towards these patients may help reduce hypoglycemia risk and thereby favorably impact associated economic outcomes relevant to key stakeholders.
    Keywords:  Healthcare costs; Hypoglycemia; Machine learning; Resource utilization; Type 2 diabetes; Value-based
    DOI:  https://doi.org/10.1007/s13300-020-00759-4
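The odds ratios quoted above (e.g., OR = 25.61 for prior hypoglycemia) are exponentiated coefficients of a model on the log-odds scale. A tiny sketch of that transformation; the coefficient values below are invented for illustration, not the study's estimates:

```python
import math

# Odds ratios are the exponentiated coefficients of a (generalised)
# linear model on the log-odds scale. Coefficients below are
# hypothetical, not the study's estimates.

def odds_ratios(coefs):
    return {name: math.exp(beta) for name, beta in coefs.items()}

coefs = {
    "prior_hypoglycemia": 1.5,   # hypothetical log-odds coefficient
    "insulin_use": 1.0,
    "biguanide_use": -0.3,
}
for name, or_ in odds_ratios(coefs).items():
    direction = "raises" if or_ > 1 else "lowers"
    print(f"{name}: OR={or_:.2f} ({direction} odds)")
```

An OR above 1 (positive coefficient) raises the odds of the outcome; below 1 (negative coefficient) lowers them, which is how the abstract's protective factors such as biguanide use (OR = 0.75) should be read.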
  11. Diabetes Care. 2020 Feb 06. pii: dc192057. [Epub ahead of print]
       OBJECTIVE: To construct and internally validate prediction models to estimate the risk of long-term end-organ complications and mortality in patients with type 2 diabetes and obesity that can be used to inform treatment decisions for patients and practitioners who are considering metabolic surgery.
    RESEARCH DESIGN AND METHODS: A total of 2,287 patients with type 2 diabetes who underwent metabolic surgery between 1998 and 2017 in the Cleveland Clinic Health System were propensity-matched 1:5 to 11,435 nonsurgical patients with BMI ≥30 kg/m2 and type 2 diabetes who received usual care with follow-up through December 2018. Multivariable time-to-event regression and random forest machine learning models were built and internally validated using fivefold cross-validation to predict the 10-year risk for four outcomes of interest. The prediction models were programmed to construct user-friendly web-based and smartphone applications of Individualized Diabetes Complications (IDC) Risk Scores for clinical use.
    RESULTS: The prediction tools demonstrated the following discrimination ability based on the area under the receiver operating characteristic curve (1 = perfect discrimination and 0.5 = chance) at 10 years in the surgical and nonsurgical groups, respectively: all-cause mortality (0.79 and 0.81), coronary artery events (0.66 and 0.67), heart failure (0.73 and 0.75), and nephropathy (0.73 and 0.76). When a patient's data are entered into the IDC application, it estimates the individualized 10-year morbidity and mortality risks with and without undergoing metabolic surgery.
    CONCLUSIONS: The IDC Risk Scores can provide personalized evidence-based risk information for patients with type 2 diabetes and obesity about future cardiovascular outcomes and mortality with and without metabolic surgery based on their current status of obesity, diabetes, and related cardiometabolic conditions.
    DOI:  https://doi.org/10.2337/dc19-2057
  12. JMIR Med Inform. 2020 Jan 31. 8(1): e15510
       BACKGROUND: Artificial intelligence-enabled electronic health record (EHR) analysis can revolutionize medical practice from the diagnosis and prediction of complex diseases to making recommendations in patient care, especially for chronic conditions such as chronic kidney disease (CKD), which is one of the most frequent complications in patients with diabetes and is associated with substantial morbidity and mortality.
    OBJECTIVE: The longitudinal prediction of health outcomes requires effective representation of temporal data in the EHR. In this study, we proposed a novel temporal-enhanced gradient boosting machine (GBM) model that dynamically updates and ensembles learners based on new events in patient timelines to improve the prediction accuracy of CKD among patients with diabetes.
    METHODS: Using a broad spectrum of deidentified EHR data on a retrospective cohort of 14,039 adult patients with type 2 diabetes and GBM as the base learner, we validated our proposed Landmark-Boosting model against three state-of-the-art temporal models for rolling predictions of 1-year CKD risk.
    RESULTS: The proposed model uniformly outperformed other models, achieving an area under receiver operating curve of 0.83 (95% CI 0.76-0.85), 0.78 (95% CI 0.75-0.82), and 0.82 (95% CI 0.78-0.86) in predicting CKD risk with automatic accumulation of new data in later years (years 2, 3, and 4 since diabetes mellitus onset, respectively). The Landmark-Boosting model also maintained the best calibration across moderate- and high-risk groups and over time. The experimental results demonstrated that the proposed temporal model can not only accurately predict 1-year CKD risk but also improve performance over time with additionally accumulated data, which is essential for clinical use to improve renal management of patients with diabetes.
    CONCLUSIONS: Incorporation of temporal information in EHR data can significantly improve predictive model performance and will particularly benefit patients who follow-up with their physicians as recommended.
    Keywords:  chronic kidney disease; diabetic kidney disease; diabetic nephropathy; machine learning
    DOI:  https://doi.org/10.2196/15510
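The landmark idea behind the temporal model above, fitting a new base learner as each period's data accrues and ensembling all learners fitted so far, can be sketched minimally. The "learner" here is just a matched positive rate, a stand-in for the paper's GBM base learner; all data are invented:

```python
# Sketch of a landmark-style updating ensemble (illustrative only).
# Each landmark (e.g. yearly) adds a learner fitted on newly accrued
# data; predictions average all learners fitted so far. The "learner"
# is a matched positive rate, a stand-in for a real GBM base learner.

def fit_learner(window):
    """window: list of (feature, label). Returns P(label=1 | feature)."""
    def learner(x):
        matched = [lbl for f, lbl in window if f == x]
        return sum(matched) / len(matched) if matched else 0.5
    return learner

class LandmarkEnsemble:
    def __init__(self):
        self.learners = []

    def update(self, new_window):
        """Called at each landmark with newly accrued patient data."""
        self.learners.append(fit_learner(new_window))

    def predict(self, x):
        """Average the probabilities of all learners fitted so far."""
        return sum(l(x) for l in self.learners) / len(self.learners)

ens = LandmarkEnsemble()
ens.update([("hi_risk", 1), ("hi_risk", 1), ("lo_risk", 0)])  # year 1
ens.update([("hi_risk", 1), ("lo_risk", 0), ("lo_risk", 0)])  # year 2
print(ens.predict("hi_risk"))  # 1.0
```

The key property this preserves from the paper is that predictions improve as follow-up accumulates, without refitting from scratch at every time point.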
  13. Br J Radiol. 2020 Feb 06. 20190812
      In this review, we describe the technical aspects of artificial intelligence (AI) in cardiac imaging, from radiomics and the basic algorithms of deep learning to their application tasks and the recent availability of public databases. Subsequently, we conducted a systematic literature search for recently published clinically relevant studies on AI in cardiac imaging. As a result, 24 and 14 studies using CT and MRI, respectively, were included and summarized. From these studies, it can be concluded that AI is widely applied to cardiac imaging in the clinic, including coronary calcium scoring, coronary CT angiography, fractional flow reserve CT, plaque analysis, left ventricular myocardium analysis, diagnosis of myocardial infarction, prognosis of coronary artery disease, assessment of cardiac function, and diagnosis and prognosis of cardiomyopathy. These advancements show that AI has a promising prospect in cardiac imaging.
    DOI:  https://doi.org/10.1259/bjr.20190812
  14. J Matern Fetal Neonatal Med. 2020 Feb 04. 1-8
       Background: Advances in omics and computational artificial intelligence (AI) have been said to be key to meeting the objectives of precision cardiovascular medicine. The focus of precision medicine includes a better assessment of disease risk and understanding of disease mechanisms. Our objective was to determine whether significant epigenetic changes occur in isolated, non-syndromic coarctation of the aorta (CoA). Further, we evaluated AI analysis of DNA methylation for the prediction of CoA.
    Methods: Genome-wide DNA methylation analysis of newborn blood DNA was performed in 24 isolated, non-syndromic CoA cases and 16 controls using the Illumina HumanMethylation450 BeadChip arrays. Cytosine nucleotide (CpG) methylation changes in CoA at each of 450,000 CpG loci were determined. Ingenuity pathway analysis (IPA) was performed to identify molecular and disease pathways that were epigenetically dysregulated. Using the methylation data, six artificial intelligence (AI) platforms, including deep learning (DL), were used for CoA detection.
    Results: We identified significant (FDR p-value ≤ .05) methylation changes in 65 different CpG sites located in 75 genes in CoA subjects. DL achieved an AUC (95% CI) of 0.97 (0.80-1) with 95% sensitivity and 98% specificity. Gene ontology (GO) analysis yielded epigenetic alterations in important cardiovascular developmental genes and biological processes: abnormal morphology of the cardiovascular system, left ventricular dysfunction, heart conduction disorder, thrombus formation, and coronary artery disease.
    Conclusion: In this exploratory study, we report the use of AI and epigenomics to achieve important objectives of precision cardiovascular medicine. Accurate prediction of CoA was achieved using a newborn blood spot. Further, we provided evidence of a significant epigenetic etiology in isolated CoA development.
    Keywords:  Artificial intelligence; DNA methylation; congenital heart defect; deep learning; epigenetics
    DOI:  https://doi.org/10.1080/14767058.2020.1722995
  15. Am J Obstet Gynecol. 2020 Jan 30. pii: S0002-9378(20)30001-6. [Epub ahead of print]
       BACKGROUND: Efforts to reduce cesarean delivery rates to 12-15% have been undertaken worldwide. Special focus has been directed towards parturients undergoing a trial of labor after cesarean delivery, in order to reduce the burden of repeated cesarean deliveries. Complication rates are lowest when a vaginal birth is achieved and highest when an unplanned cesarean is performed, emphasizing the need to assess in advance the likelihood of a successful vaginal birth after cesarean. Vaginal birth after cesarean delivery calculators were developed in different populations, however, some limitations to their implementation into clinical practice were described. Machine learning methods enable investigation of large-scale datasets with input combinations that traditional statistical analysis tools have difficulty processing.
    OBJECTIVE: The aim of this study was to evaluate the feasibility of using machine-learning methods to predict a successful vaginal birth after cesarean delivery.
    STUDY DESIGN: The electronic medical records of singleton, term labors during a 12-year period in a tertiary referral center were analyzed. Using gradient boosting, models incorporating multiple maternal and fetal features were created to predict successful vaginal birth in parturients undergoing a trial of labor after cesarean delivery. One model was created to provide a personalized risk score for vaginal birth after cesarean delivery using features available as early as the first antenatal visit; a second model was then created that reassesses this score after adding features available only in proximity to delivery.
    RESULTS: A cohort of 9,888 parturients with one previous cesarean delivery was identified, in which 75.6% (n=7,473) of parturients attempted a trial of labor, with a success rate of 88%. A machine learning-based model to predict when vaginal delivery would be successful was developed. When using features available at the first antenatal visit, the model showed a receiver operating characteristic curve with an area under the curve of 0.745 (95% confidence interval 0.728-0.762), which increased to 0.793 (95% confidence interval 0.778-0.808) when features available in proximity to the delivery process were added. Additionally, for the latter model, a risk stratification tool was built to allocate parturients into low-, medium- and high-risk groups for failed trial of labor after cesarean delivery. The low- and medium-risk groups (42.4% and 25.6% of parturients, respectively) showed success rates of 97.3% and 90.9%, respectively. The high-risk group (32.1%) had a vaginal delivery success rate of 73.3%. Applying the model to a cohort of parturients who elected a repeat cesarean delivery (n=2,145) demonstrated that 31% of these parturients would have been allocated to the low- and medium-risk groups had a trial of labor been attempted.
    CONCLUSION: Trial of labor after cesarean delivery is safe for most parturients. Success rates are high even in a population with high rates of trial of labor after cesarean. Applying a machine learning algorithm to assign a personalized risk score for a successful vaginal birth after cesarean delivery may help in decision making and contribute to a reduction in cesarean delivery rates. Parturient allocation to risk groups may help delivery process management.
    Keywords:  machine learning; obstetrics; personalized medicine; prediction; trial of labor; vaginal birth after cesarean
    DOI:  https://doi.org/10.1016/j.ajog.2019.12.267
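The risk-stratification step described above amounts to bucketing parturients by their predicted probability of a successful vaginal birth. A small sketch; the thresholds and predicted probabilities below are invented for illustration, not the study's cut-offs:

```python
# Sketch of risk stratification by predicted VBAC success probability.
# Thresholds and probabilities are hypothetical, not the study's.

def stratify(prob_success, low_cut=0.95, high_cut=0.85):
    """Return 'low', 'medium', or 'high' risk of a failed trial of
    labor, based on predicted probability of a successful VBAC."""
    if prob_success >= low_cut:
        return "low"
    if prob_success >= high_cut:
        return "medium"
    return "high"

predicted = {"A": 0.97, "B": 0.90, "C": 0.70}
groups = {pid: stratify(p) for pid, p in predicted.items()}
print(groups)  # {'A': 'low', 'B': 'medium', 'C': 'high'}
```

In practice the cut-offs would be chosen on a validation set so that each group's observed success rate matches its intended risk label.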
  16. Antibiotics (Basel). 2020 Jan 31. 9(2): pii: E50. [Epub ahead of print]
      Hospital-acquired infections, particularly in the critical care setting, have become increasingly common during the last decade, with Gram-negative bacterial infections presenting the highest incidence among them. Multi-drug-resistant (MDR) Gram-negative infections are associated with high morbidity and mortality, with significant direct and indirect costs resulting from long hospitalization due to antibiotic failure. Given the critical health status of patients in the intensive care unit (ICU), time is critical to identifying bacteria and their resistance to antibiotics. As common antibiotic resistance tests require more than 24 h after the sample is collected to determine sensitivity to specific antibiotics, we suggest applying machine learning (ML) techniques to assist the clinician in determining whether bacteria are resistant to individual antimicrobials by knowing only a sample's Gram stain, site of infection, and patient demographics. In our single-center study, we compared the performance of eight machine learning algorithms in predicting antibiotic susceptibility. The demographic characteristics of the patients were considered in this study, as well as data from cultures and susceptibility testing. Applying machine learning algorithms to readily available patient antimicrobial susceptibility data from the microbiology laboratory alone, without any of the patient's clinical data, can provide informative antibiotic susceptibility predictions even in resource-limited hospital settings, aiding clinicians in selecting appropriate empirical antibiotic therapy. These strategies, when used as a decision support tool, have the potential to improve empiric therapy selection and reduce the antimicrobial resistance burden.
    Keywords:  ICU; ML techniques; antibiotic resistance; antimicrobial resistance; artificial intelligence; intensive care unit; machine learning; prediction
    DOI:  https://doi.org/10.3390/antibiotics9020050
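The prediction task described above — resistant versus susceptible from a handful of categorical inputs (Gram stain, infection site, demographics) — can be illustrated with a toy categorical naive Bayes classifier, one of the algorithm families such studies typically compare. This is a minimal sketch, not the study's code; all feature values and labels below are fabricated for illustration.

```python
from collections import Counter, defaultdict

def train_nb(X, y):
    """Categorical naive Bayes with Laplace smoothing.
    X: list of feature tuples; y: list of class labels."""
    priors = Counter(y)
    cond = defaultdict(Counter)   # (feature_index, class) -> value counts
    vocab = defaultdict(set)      # feature_index -> distinct values seen
    for xi, yi in zip(X, y):
        for j, v in enumerate(xi):
            cond[(j, yi)][v] += 1
            vocab[j].add(v)
    return priors, cond, vocab

def predict_nb(model, xi):
    priors, cond, vocab = model
    n = sum(priors.values())
    best_cls, best_p = None, -1.0
    for cls, cnt in priors.items():
        p = cnt / n
        for j, v in enumerate(xi):
            # Laplace smoothing keeps unseen feature values from zeroing the product
            p *= (cond[(j, cls)][v] + 1) / (cnt + len(vocab[j]))
        if p > best_p:
            best_cls, best_p = cls, p
    return best_cls

# Fabricated training data: (Gram stain, site of infection, sex) -> R(esistant)/S(usceptible)
X = [("gram-", "blood", "M"), ("gram-", "urine", "F"), ("gram-", "blood", "F"),
     ("gram+", "wound", "M"), ("gram+", "urine", "F"), ("gram+", "blood", "M")]
y = ["R", "R", "R", "S", "S", "S"]
model = train_nb(X, y)
print(predict_nb(model, ("gram-", "blood", "M")))  # → R
```

With real susceptibility data, each antimicrobial would typically get its own model of this kind, trained on laboratory results alone.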
  17. JAMA Netw Open. 2020 Feb 05. 3(2): e1920733
       Importance: The ability to accurately predict in-hospital mortality for patients at the time of admission could improve clinical and operational decision-making and outcomes. Few of the machine learning models that have been developed to predict in-hospital death are both broadly applicable to all adult patients across a health system and readily implementable. Similarly, few have been implemented, and none have been evaluated prospectively and externally validated.
    Objectives: To prospectively and externally validate a machine learning model that predicts in-hospital mortality for all adult patients at the time of hospital admission and to design the model using commonly available electronic health record data and accessible computational methods.
    Design, Setting, and Participants: In this prognostic study, electronic health record data from a total of 43 180 hospitalizations representing 31 003 unique adult patients admitted to a quaternary academic hospital (hospital A) from October 1, 2014, to December 31, 2015, formed a training and validation cohort. The model was further validated in additional cohorts spanning from March 1, 2018, to August 31, 2018, using 16 122 hospitalizations representing 13 094 unique adult patients admitted to hospital A, 6586 hospitalizations representing 5613 unique adult patients admitted to hospital B, and 4086 hospitalizations representing 3428 unique adult patients admitted to hospital C. The model was integrated into the production electronic health record system and prospectively validated on a cohort of 5273 hospitalizations representing 4525 unique adult patients admitted to hospital A between February 14, 2019, and April 15, 2019.
    Main Outcomes and Measures: The main outcome was in-hospital mortality. Model performance was quantified using the area under the receiver operating characteristic curve and area under the precision recall curve.
    Results: A total of 75 247 hospital admissions (median [interquartile range] patient age, 59.5 [29.0] years; 45.9% involving male patients) were included in the study. The in-hospital mortality rates for the training and validation cohort; the retrospective validation cohorts at hospitals A, B, and C; and the prospective validation cohort were 3.0%, 2.7%, 1.8%, 2.1%, and 1.6%, respectively. The areas under the receiver operating characteristic curves were 0.87 (95% CI, 0.83-0.89), 0.85 (95% CI, 0.83-0.87), 0.89 (95% CI, 0.86-0.92), 0.84 (95% CI, 0.80-0.89), and 0.86 (95% CI, 0.83-0.90), respectively. The areas under the precision recall curves were 0.29 (95% CI, 0.25-0.37), 0.17 (95% CI, 0.13-0.22), 0.22 (95% CI, 0.14-0.31), 0.13 (95% CI, 0.08-0.21), and 0.14 (95% CI, 0.09-0.21), respectively.
    Conclusions and Relevance: Prospective and multisite retrospective evaluations of a machine learning model demonstrated good discrimination of in-hospital mortality for adult patients at the time of admission. The data elements, methods, and patient selection make the model implementable at a system level.
    DOI:  https://doi.org/10.1001/jamanetworkopen.2019.20733
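The gap between the two metrics reported here (AUROC around 0.85 versus AUPRC around 0.17) is typical when the outcome is rare, as with this cohort's 2-3% in-hospital mortality. As a minimal illustration of how each metric is computed — not the study's code — the rank-based AUROC and average-precision AUPRC can be sketched as:

```python
def auroc(y_true, scores):
    # Probability that a random positive outranks a random negative (ties count half)
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def auprc(y_true, scores):
    # Average precision: mean of precision at the rank of each positive case
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    tp, ap = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            tp += 1
            ap += tp / rank
    return ap / sum(y_true)

# Two positives among eight cases: a mid-ranked positive drags AUPRC down faster than AUROC
y = [1, 0, 0, 1, 0, 0, 0, 0]
s = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
print(auroc(y, s), auprc(y, s))  # → 0.8333…, 0.75
```

With a 2-3% positive rate, even a well-discriminating model has far more false than true positives at any threshold, which is why reporting AUPRC alongside AUROC is informative.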
  18. Anesthesiology. 2020 Jan 30.
       WHAT WE ALREADY KNOW ABOUT THIS TOPIC: Unplanned hospital readmissions are a focus of quality improvement, national benchmarking, and payment incentives in the United States. The accuracy of commonly used peer-reviewed readmission prediction algorithms at specific hospitals may be limited by hospital-specific factors. The potential value of novel machine learning techniques capable of incorporating hundreds of patient, process, and hospital attributes is unclear. WHAT THIS MANUSCRIPT TELLS US THAT IS NEW: Hospital-specific 30-day surgical readmission models using machine learning techniques provide clinically usable predictions when applied to future patients. A parsimonious approach limiting which data elements are considered performs as well as more comprehensive models. BACKGROUND: Although prediction of hospital readmissions has been studied in medical patients, it has received relatively little attention in surgical patient populations. Published predictors require information only available at the moment of discharge. The authors hypothesized that machine learning approaches can be leveraged to accurately predict readmissions in postoperative patients from the emergency department. Further, the authors hypothesized that these approaches can accurately predict the risk of readmission much sooner than hospital discharge.
    METHODS: Using a cohort of surgical patients at a tertiary care academic medical center, surgical, demographic, lab, medication, care team, and current procedural terminology data were extracted from the electronic health record. The primary outcome was whether a hospital readmission originating from the emergency department occurred within 30 days of surgery. Secondarily, the time interval from surgery to the prediction was analyzed at 0, 12, 24, 36, 48, and 60 h. Different machine learning models for predicting the primary outcome were evaluated with respect to the area under the receiver operating characteristic curve using different permutations of the available features.
    RESULTS: Surgical hospital admissions (N = 34,532) from April 2013 to December 2016 were included in the analysis. Surgical and demographic features led to moderate discrimination for prediction after discharge (area under the curve: 0.74 to 0.76), whereas medication, consulting team, and current procedural terminology features did not improve the discrimination. Lab features improved discrimination, with gradient-boosted trees attaining the best performance (area under the curve: 0.866, SD 0.006). This performance was sustained during temporal validation with 2017 to 2018 data (area under the curve: 0.85 to 0.88). Lastly, the discrimination of the predictions calculated 36 h after surgery (area under the curve: 0.88 to 0.89) nearly matched those from time of discharge.
    CONCLUSIONS: A machine learning approach to predicting postoperative readmission can produce hospital-specific models for accurately predicting 30-day readmissions via the emergency department. Moreover, these predictions can be confidently calculated at 36 h after surgery without consideration of discharge-level data.
    DOI:  https://doi.org/10.1097/ALN.0000000000003140
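Gradient-boosted trees, the best performer in this study, build an additive model by repeatedly fitting small trees to the residuals of the current prediction. A self-contained toy sketch of that idea — one-feature regression stumps under squared loss, not the authors' implementation, which used rich clinical features — looks like this:

```python
def best_stump(x, resid):
    # Find the threshold split minimizing squared error against the residuals
    best = None
    for t in sorted(set(x))[:-1]:             # exclude max so both sides are nonempty
        left = [r for xi, r in zip(x, resid) if xi <= t]
        right = [r for xi, r in zip(x, resid) if xi > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - (lm if xi <= t else rm)) ** 2 for xi, r in zip(x, resid))
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    return best[1:]

def boost(x, y, n_rounds=100, lr=0.5):
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        resid = [yi - pi for yi, pi in zip(y, pred)]   # negative gradient of squared loss
        t, lm, rm = best_stump(x, resid)
        stumps.append((t, lm, rm))
        pred = [pi + lr * (lm if xi <= t else rm) for xi, pi in zip(x, pred)]
    return base, lr, stumps

def predict(model, xi):
    base, lr, stumps = model
    return base + sum(lr * (lm if xi <= t else rm) for t, lm, rm in stumps)

# The boosted ensemble recovers a step function the base mean alone cannot express
x = list(range(10))
y = [0.0] * 5 + [1.0] * 5
model = boost(x, y)
```

The paper's temporal validation corresponds to fitting such a model on admissions from one period (2013-2016) and evaluating discrimination on a later, disjoint period (2017-2018).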
  19. Crit Care. 2020 Feb 06. 24(1): 42
       BACKGROUND: Previous scoring models such as the Acute Physiologic Assessment and Chronic Health Evaluation II (APACHE II) and the Sequential Organ Failure Assessment (SOFA) scoring systems do not adequately predict mortality of patients undergoing continuous renal replacement therapy (CRRT) for severe acute kidney injury. Accordingly, the present study applies machine learning algorithms to improve prediction accuracy for this patient subset.
    METHODS: We randomly divided a total of 1571 adult patients who started CRRT for acute kidney injury into training (70%, n = 1094) and test (30%, n = 477) sets. The primary output consisted of the probability of mortality during admission to the intensive care unit (ICU) or hospital. We compared the area under the receiver operating characteristic curves (AUCs) of several machine learning algorithms with that of the APACHE II, SOFA, and the new abbreviated mortality scoring system for acute kidney injury with CRRT (MOSAIC model) results.
    RESULTS: For ICU mortality, the random forest model showed the highest AUC (0.784 [0.744-0.825]), and the artificial neural network and extreme gradient boost models demonstrated the next best results (0.776 [0.735-0.818]). The AUC of the random forest model was higher than those achieved by APACHE II (0.611 [0.583-0.640]), SOFA (0.677 [0.651-0.703]), and MOSAIC (0.722 [0.677-0.767]). The machine learning models also predicted in-hospital mortality better than APACHE II, SOFA, and MOSAIC.
    CONCLUSION: Machine learning algorithms increase the accuracy of mortality prediction for patients undergoing CRRT for acute kidney injury compared with previous scoring models.
    Keywords:  Acute kidney injury; Continuous renal replacement therapy; Intensive care unit; Machine learning; Mortality
    DOI:  https://doi.org/10.1186/s13054-020-2752-7
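The bracketed ranges reported with each AUC (e.g., 0.784 [0.744-0.825]) are confidence intervals; one common nonparametric way to obtain them is the percentile bootstrap, resampling patients with replacement and taking percentiles of the resulting AUC distribution. A minimal sketch with fabricated scores — not the study's data or its exact interval method, which the abstract does not specify:

```python
import random

def auroc(y, s):
    # Rank-based AUC: fraction of positive/negative pairs ranked correctly
    pos = [si for yi, si in zip(y, s) if yi]
    neg = [si for yi, si in zip(y, s) if not yi]
    return sum((p > n) + 0.5 * (p == n) for p in pos for n in neg) / (len(pos) * len(neg))

def bootstrap_auc_ci(y, s, n_boot=1000, alpha=0.05, seed=0):
    # Resample patients with replacement; take percentiles of the AUC distribution
    rng = random.Random(seed)
    n, stats = len(y), []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        yb = [y[i] for i in idx]
        if 0 < sum(yb) < n:                  # need both classes in the resample
            stats.append(auroc(yb, [s[i] for i in idx]))
    stats.sort()
    return stats[int(n_boot * alpha / 2)], stats[int(n_boot * (1 - alpha / 2))]

# Fabricated cohort: 30% mortality, risk scores that separate classes imperfectly
rng = random.Random(1)
y = [1] * 30 + [0] * 70
s = [0.4 + 0.6 * rng.random() if yi else 0.6 * rng.random() for yi in y]
lo, hi = bootstrap_auc_ci(y, s)
print(round(auroc(y, s), 3), (round(lo, 3), round(hi, 3)))
```

Analytic alternatives such as the DeLong method give similar intervals without resampling; the bootstrap has the advantage of working for any metric.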
  20. Neurology. 2020 Feb 06. pii: 10.1212/WNL.0000000000009068. [Epub ahead of print]
       OBJECTIVE: Genetic diagnosis of muscular dystrophies (MDs) has classically been guided by clinical presentation, muscle biopsy, and muscle MRI data. Muscle MRI suggests diagnosis based on the pattern of muscle fatty replacement. However, patterns overlap between different disorders and knowledge about disease-specific patterns is limited. Our aim was to develop a software-based tool that can recognize muscle MRI patterns and thus aid diagnosis of MDs.
    METHODS: We collected 976 pelvic and lower-limb T1-weighted muscle MRIs from 10 different MDs. Fatty replacement was quantified using the Mercuri score, and files containing the numeric data were generated. Random forest supervised machine learning was applied to develop a model useful for identifying the correct diagnosis. Two thousand different models were generated, and the one with the highest accuracy was selected. A new set of 20 MRIs was used to test the accuracy of the model, and the results were compared with diagnoses proposed by 4 specialists in the field.
    RESULTS: A total of 976 lower-limb MRIs from 10 different MDs were used. The best model obtained had 95.7% accuracy, with 92.1% sensitivity and 99.4% specificity. When compared with experts in the field, the diagnostic accuracy of the model was significantly higher in a new set of 20 MRIs.
    CONCLUSION: Machine learning can help doctors in the diagnosis of muscular dystrophies by analyzing patterns of muscle fatty replacement on muscle MRI. This tool can be helpful in daily clinical practice and in the interpretation of the results of next-generation sequencing tests.
    CLASSIFICATION OF EVIDENCE: This study provides Class II evidence that a muscle MRI-based artificial intelligence tool accurately diagnoses muscular dystrophies.
    DOI:  https://doi.org/10.1212/WNL.0000000000009068
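The feature representation used here — one fatty-replacement (Mercuri) score per muscle, giving a fixed-length numeric vector per patient — is what makes standard classifiers applicable to the MRI pattern. As an illustration of that representation only, the sketch below uses a deliberately simpler k-nearest-neighbour classifier rather than the paper's random forest, with fabricated 6-muscle score vectors and made-up pattern labels:

```python
from collections import Counter

def knn_predict(train, query, k=3):
    """train: list of (mercuri_vector, diagnosis); vectors hold scores 0-4 per muscle."""
    # Rank training cases by squared Euclidean distance in Mercuri-score space
    nearest = sorted(train, key=lambda t: sum((a - b) ** 2 for a, b in zip(t[0], query)))[:k]
    # Majority vote among the k closest cases
    return Counter(dx for _, dx in nearest).most_common(1)[0][0]

# Fabricated score vectors with two synthetic fatty-replacement patterns:
# pattern-A affects proximal muscles, pattern-B distal ones
train = [
    ([4, 4, 3, 1, 0, 0], "pattern-A"),
    ([4, 3, 3, 2, 0, 1], "pattern-A"),
    ([3, 4, 4, 1, 1, 0], "pattern-A"),
    ([0, 1, 0, 3, 4, 4], "pattern-B"),
    ([1, 0, 1, 4, 3, 4], "pattern-B"),
    ([0, 0, 2, 4, 4, 3], "pattern-B"),
]
print(knn_predict(train, [4, 4, 4, 1, 0, 1]))  # → pattern-A
```

A random forest on the same vectors, as in the study, additionally yields per-muscle feature importances, which can make the learned pattern clinically interpretable.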
  21. BMC Med Inform Decis Mak. 2020 Feb 06. 20(1): 21
       BACKGROUND: A common problem in machine learning applications is the availability of data at the point of decision making. The aim of the present study was to use routine data readily available at admission to predict aspects relevant to the organization of psychiatric hospital care. A further aim was to compare the results of a machine learning approach with those obtained through a traditional method and those obtained through a naive baseline classifier.
    METHODS: The study included consecutively discharged patients between 1st of January 2017 and 31st of December 2018 from nine psychiatric hospitals in Hesse, Germany. We compared the predictive performance achieved by stochastic gradient boosting (GBM) with multiple logistic regression and a naive baseline classifier. We tested the performance of our final models on unseen patients from another calendar year and from different hospitals.
    RESULTS: The study included 45,388 inpatient episodes. The models' performance, as measured by the area under the receiver operating characteristic curve, varied strongly between the predicted outcomes, with relatively high performance in the prediction of coercive treatment (area under the curve: 0.83) and 1:1 observations (0.80) and relatively poor performance in the prediction of short length of stay (0.69) and non-response to treatment (0.65). The GBM performed slightly better than logistic regression. Both approaches were substantially better than a naive prediction based solely on basic diagnostic grouping.
    CONCLUSION: The present study has shown that administrative routine data can be used to predict aspects relevant to the organisation of psychiatric hospital care. Future research should investigate the predictive performance that is necessary to provide effective assistance in clinical practice for the benefit of both staff and patients.
    Keywords:  Decision support techniques; Health services administration; Hospitals; Machine learning; Psychiatry
    DOI:  https://doi.org/10.1186/s12911-020-1042-2