Front Endocrinol (Lausanne). 2025;16:1514397.
Background: Isolated impaired glucose tolerance (I-IGT) is a specific prediabetic state that typically requires a standardized oral glucose tolerance test (OGTT) for diagnosis. This study aims to predict glucose tolerance status in Chinese Han men in the fasting state using machine learning (ML) models built on demographic, anthropometric, and laboratory data.
Methods: The study population consisted of 1,117 Chinese Han men aged 50-87 years. Baseline variables, including age, fasting plasma glucose (FPG), high blood pressure (HBP), body mass index (BMI), waist-to-hip ratio (WHR), total cholesterol (TC), triglyceride (TG), high-density lipoprotein cholesterol (HDL-C), and low-density lipoprotein cholesterol (LDL-C), were collected from electronic medical records (EMRs) for model training and validation. Eight ML models were compared: Support Vector Machine (SVM), Decision Tree (DT), Random Forest (RF), Logistic Regression (LR), K-Nearest Neighbors (KNN), Naive Bayes (NB), Adaptive Boosting (AdaBoost), and Gradient Boosting Machines (GBM). Model performance was evaluated using accuracy, recall, F1 score, positive predictive value (PPV), negative predictive value (NPV), and the area under the receiver operating characteristic curve (AUC). Shapley Additive Explanations (SHAP) and confusion matrix plots were used for model interpretation.
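The threshold-based metrics listed in the Methods can all be derived from the four cells of a confusion matrix. The following is a minimal sketch under stated assumptions: the class `ConfusionCounts`, the function `classification_metrics`, and the example counts are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Cells of a binary confusion matrix (positive class = I-IGT)."""
    tp: int  # I-IGT correctly predicted as I-IGT
    fp: int  # non-I-IGT incorrectly predicted as I-IGT
    tn: int  # non-I-IGT correctly predicted as non-I-IGT
    fn: int  # I-IGT incorrectly predicted as non-I-IGT


def classification_metrics(c: ConfusionCounts) -> dict:
    """Compute accuracy, recall, PPV, NPV, and F1 from confusion-matrix counts."""
    total = c.tp + c.fp + c.tn + c.fn
    recall = c.tp / (c.tp + c.fn)          # sensitivity
    ppv = c.tp / (c.tp + c.fp)             # precision
    npv = c.tn / (c.tn + c.fn)
    accuracy = (c.tp + c.tn) / total
    f1 = 2 * ppv * recall / (ppv + recall)  # harmonic mean of PPV and recall
    return {
        "accuracy": accuracy,
        "recall": recall,
        "ppv": ppv,
        "npv": npv,
        "f1": f1,
    }


# Hypothetical counts for illustration only; not data from the study.
example = ConfusionCounts(tp=80, fp=10, tn=100, fn=20)
metrics = classification_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

AUC is not included here because it depends on the ranking of predicted scores rather than on a single confusion matrix.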
Results: The RF model demonstrated the best overall performance, with an accuracy of 96.7%, recall of 91.4%, F1 score of 95.7%, PPV of 99.1%, and NPV of 95.6%. The AUC values for the SVM, DT, RF, LR, KNN, NB, AdaBoost, and GBM models were 0.97, 0.92, 0.96, 0.97, 0.88, 0.88, 0.97, and 0.97, respectively. While the RF model showed strong overall performance, the LR model reached the highest reported AUC (0.97, a value shared with the SVM, AdaBoost, and GBM models at two-decimal precision), indicating strong discriminatory power. FPG was identified as the most important predictor for I-IGT, followed by HDL-C, TC, HBP, BMI, and WHR. Individuals with FPG levels higher than 5.1 mmol/L were more likely to have I-IGT; the performance metrics for this cut-off value were: 89.35% accuracy, 89.79% recall, 85.22% F1 score, 81.09% PPV, 94.38% NPV, and 0.95 AUC.
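The FPG cut-off reported above amounts to a single-variable screening rule, and its F1 score can be reproduced from the reported PPV and recall. The sketch below is illustrative only: the function names and the FPG readings are hypothetical, and treating "higher than 5.1 mmol/L" as a strict inequality is an assumption about how the rule was applied.

```python
FPG_CUTOFF_MMOL_L = 5.1  # cut-off value reported in the Results


def flag_possible_i_igt(fpg_mmol_l: float, cutoff: float = FPG_CUTOFF_MMOL_L) -> bool:
    """Flag a fasting plasma glucose value strictly above the cut-off (assumed rule)."""
    return fpg_mmol_l > cutoff


def f1_from_ppv_recall(ppv: float, recall: float) -> float:
    """F1 score as the harmonic mean of PPV (precision) and recall."""
    return 2 * ppv * recall / (ppv + recall)


# Hypothetical FPG readings in mmol/L; not data from the study.
readings = [4.8, 5.1, 5.4, 6.0]
flags = [flag_possible_i_igt(x) for x in readings]
print(flags)  # [False, False, True, True]

# PPV and recall as reported for the 5.1 mmol/L cut-off.
f1_cutoff = f1_from_ppv_recall(ppv=0.8109, recall=0.8979)
print(f"{f1_cutoff:.4f}")  # 0.8522, consistent with the reported 85.22%
```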
Conclusion: Machine learning models based on demographic and clinical characteristics offer a cost-effective method for predicting I-IGT in Chinese Han men aged 50 years and older, without the need for an OGTT. These models could complement existing early diagnostic strategies, thereby enhancing the early detection and prevention of diabetes. Additionally, FPG alone could serve as an efficient screening tool for the early identification of I-IGT in clinical settings.
Keywords: fasting plasma glucose; isolated impaired glucose tolerance; machine learning models; oral glucose tolerance test; pre-diabetes