BMC Pregnancy Childbirth. 2026 Jul 01.
BACKGROUND: Gestational diabetes mellitus (GDM) is a common metabolic disorder of pregnancy that can lead to adverse maternal and neonatal outcomes. Exosomal microRNAs (exo-miRNAs) have emerged as promising noninvasive biomarkers because of their stability and their regulatory roles in glucose metabolism. However, robust diagnostic models that integrate exo-miRNA profiles for early prediction of GDM remain lacking.
METHODS: We used the GSE192813 dataset as a discovery cohort to identify exo-miRNAs that were differentially expressed (DE-exo-miRNAs) between GDM and normal glucose tolerance (NGT) pregnancies. After differential expression analysis, five machine learning (ML) feature-selection algorithms (LASSO, Random Forest, SVM-RFE, XGBoost, and Boruta) were applied to identify robust predictive DE-exo-miRNA features. Subsequently, ten classification algorithms (Logistic Regression, Random Forest, SVM, XGBoost, LightGBM, CatBoost, KNN, Naïve Bayes, Neural Network, and Decision Tree) were each combined with the five feature-selection methods, generating 50 distinct ML models. Model performance was evaluated through repeated 7:3 train-test splits, and the best-performing classifier was externally validated using GSE114860.
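The model grid and the resampling scheme described above can be sketched as follows. This is a minimal illustration in standard-library Python, not the authors' code: the function names (`model_grid`, `stratified_split`, `repeated_splits`), the use of stratification, the number of repeats, and the toy labels are all assumptions, since the abstract gives only the algorithm names and the 7:3 ratio.

```python
import itertools
import random

# Algorithm names are taken from the abstract; everything else is hypothetical.
FEATURE_SELECTORS = ["LASSO", "Random Forest", "SVM-RFE", "XGBoost", "Boruta"]
CLASSIFIERS = [
    "Logistic Regression", "Random Forest", "SVM", "XGBoost", "LightGBM",
    "CatBoost", "KNN", "Naive Bayes", "Neural Network", "Decision Tree",
]


def model_grid():
    """Return every (feature selector, classifier) pair: 5 x 10 = 50 models."""
    return list(itertools.product(FEATURE_SELECTORS, CLASSIFIERS))


def stratified_split(labels, train_frac=0.7, seed=0):
    """Split sample indices so each class contributes about train_frac to train.

    Stratification is an assumption; the abstract only states a 7:3 ratio.
    """
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    train, test = [], []
    for idx in by_class.values():
        rng.shuffle(idx)
        cut = round(train_frac * len(idx))
        train.extend(idx[:cut])
        test.extend(idx[cut:])
    return sorted(train), sorted(test)


def repeated_splits(labels, n_repeats=10, train_frac=0.7):
    """Yield one 7:3 split per repeat; n_repeats=10 is an assumed value."""
    for seed in range(n_repeats):
        yield stratified_split(labels, train_frac, seed)


# Toy labels for illustration only (not the GSE192813 sample counts).
toy_labels = ["GDM"] * 10 + ["NGT"] * 10
grid = model_grid()
splits = list(repeated_splits(toy_labels))
```

In a full analysis, each of the 50 pairs would be fitted on the training indices of every split and scored on the held-out indices, with feature selection repeated inside each training fold to avoid information leakage.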
RESULTS: A total of 12 DE-exo-miRNAs were identified in GSE192813, of which a subset of key exo-miRNAs (including miR-423-5p, miR-99a-5p, miR-148a-3p, miR-192-5p, and miR-122-5p) was consistently selected across multiple algorithms. Among the 50 ML combinations, the XGBoost + Boruta model achieved the highest diagnostic performance, with an AUC exceeding 0.90 and an overall accuracy greater than 90% in the discovery dataset. External validation in GSE114860 demonstrated stable performance, with an accuracy above 80% and good calibration. Functional enrichment analysis of the predicted target genes indicated significant involvement in insulin signaling, lipid metabolism, and inflammatory pathways.
CONCLUSION: This integrative machine learning framework identified a robust exo-miRNA-based predictive signature for GDM. The model showed high diagnostic accuracy and generalized to an independent cohort, highlighting its potential for early, noninvasive screening and precision management of GDM.
Keywords: Biomarkers; Early prediction; Exo-miRNAs; Gestational diabetes mellitus (GDM); Machine learning