NPJ Digit Med. 2020 ;3
51
Hospital systems, payers, and regulators have focused on reducing length of stay (LOS) and early readmission, with uncertain benefit. Interpretable machine learning (ML) may assist in transparently identifying the risk of important outcomes. We conducted a retrospective cohort study of hospitalizations at a tertiary academic medical center and its branches from January 2011 to May 2018. A consecutive sample of all hospitalizations in the study period were included. Algorithms were trained on medical, sociodemographic, and institutional variables to predict readmission, length of stay (LOS), and death within 48-72 h. Prediction performance was measured by area under the receiver operator characteristic curve (AUC), Brier score loss (BSL), which measures how well predicted probability matches observed probability, and other metrics. Interpretations were generated using multiple feature extraction algorithms. The study cohort included 1,485,880 hospitalizations for 708,089 unique patients (median age of 59 years, first and third quartiles (QI) [39, 73]; 55.6% female; 71% white). There were 211,022 30-day readmissions for an overall readmission rate of 14% (for patients ≥65 years: 16%). Median LOS, including observation and labor and delivery patients, was 2.94 days (QI [1.67, 5.34]), or, if these patients are excluded, 3.71 days (QI [2.15, 6.51]). Predictive performance was as follows: 30-day readmission (AUC 0.76/BSL 0.11); LOS > 5 days (AUC 0.84/BSL 0.15); death within 48-72 h (AUC 0.91/BSL 0.001). Explanatory diagrams showed factors that impacted each prediction.
Keywords: Health care economics; Outcomes research; Risk factors