bims-aukdir Biomed News
on Automated knowledge discovery in diabetes research
Issue of 2026–05–24
29 papers selected by
Mott Given



  1. Arch Endocrinol Metab. 2026 Aug 01. 70(4):
       OBJECTIVE: Research on the use of portable fundus cameras utilizing artificial intelligence (AI) for diabetic retinopathy (DR) screening in primary care remains limited. We aimed to evaluate the accuracy and reliability of DR screening in primary care using a smartphone-based, AI-assisted device in a small municipality in southern Brazil.
    MATERIALS AND METHODS: The reference standard was classification of fundus images by a retina specialist. Patients with diabetes enrolled in the Brazilian Family Health Program were recruited for the study. A general ophthalmologist obtained fundus images from 134 patients, and a retina specialist validated the DR diagnosis by AI.
    RESULTS: The sample was predominantly female, with most patients having type 2 diabetes mellitus (T2DM). The age ranged from 17 to 81 years. Blood pressure was controlled in 34.9% of the sample. HbA1c levels ranged from 5.4% to 13.9%, and 35.3% of participants had levels below 7.0%. After excluding eight participants due to low image quality, the DR prevalence was 24.6%. The AI-based screening test for DR in primary care demonstrated a sensitivity of 100% (95% CI 88.8-100) and a specificity of 66.3% (95% CI 55.9-75.7). The negative predictive value (NPV) was 100% (95% CI 94.3-100), and the positive predictive value (PPV) was 49.2% (95% CI 36.4-62.1).
    CONCLUSION: The smartphone-based, AI-assisted device showed good accuracy and excellent performance for DR screening in primary care. It can avoid unnecessary medical referrals and help prioritize patients with advanced disease who require early treatment to prevent severe complications.
    Keywords:  Diabetes mellitus; artificial intelligence; diabetic retinopathy; primary health care; screening
    DOI:  https://doi.org/10.20945/2359-4292-2026-0051
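    The accuracy figures in this entry can be checked against a 2x2 table. The sketch below is a reader's reconstruction, not the authors' code. The cell counts (31 true positives, 0 false negatives, 32 false positives, 63 true negatives) are assumptions inferred from the 126 gradable patients, the 24.6% prevalence, and the reported sensitivity and specificity; the abstract does not state them.

```python
def screening_metrics(tp, fn, fp, tn):
    """Return standard diagnostic accuracy measures from 2x2 counts."""
    total = tp + fn + fp + tn
    return {
        "prevalence": (tp + fn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Assumed counts (not given in the abstract): 134 - 8 = 126 gradable patients.
metrics = screening_metrics(tp=31, fn=0, fp=32, tn=63)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

    Under these assumed counts the rounded values agree with the abstract. The PPV is modest because false positives (32) slightly outnumber true positives (31), even though no DR case is missed.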
  2. Graefes Arch Clin Exp Ophthalmol. 2026 May 22.
       PURPOSE: Diabetic retinopathy (DR), a serious complication of diabetes and a leading cause of blindness, demands accurate and early diagnosis. Current diagnostic challenges result from two major limitations: (i) a lack of expertly annotated clinical datasets due to privacy and cost factors, and (ii) the nature of deep learning models, which limits clinical adoption. This study presents a novel solution combining synthetic data generation with an explainable machine learning approach to overcome these challenges.
    METHODS: In the proposed work, synthetic retinal fundus images were generated to simulate DR severity grades, yielding a balanced dataset of 1200 instances. Handcrafted features were extracted using a pre-processing pipeline consisting of Hue Saturation Value (HSV) color space transformation, Gray Level Co-occurrence Matrix (GLCM) based texture analysis, and lesion-specific quantification, ensuring clinically relevant feature representation. A Random Forest (RF) classifier optimized with Out of Bag (OOB) validation was implemented for DR severity grading. Shapley Additive Explanations (SHAP) based explainability was integrated alongside the classifier to provide interpretations of feature importance and decision-making patterns.
    RESULTS: The proposed framework achieved an accuracy of 94% on synthetic data, outperforming several established deep learning models such as ResNet-50, VGG16 + DenseNet, and EfficientNet-B0. Swarm plot analyses revealed that predictions were consistently aligned with better confidence values, and SHAP explanations highlighted clinically interpretable features such as exudates, validating the reliability of the classifier.
    CONCLUSION: Combining the synthetic data with ground truth data through domain adaptation techniques could reduce performance gaps and improve transferability to clinical settings.
    Keywords:  Diabetic retinopathy; Explainable AI; Machine learning; Random forest classifier; SHAP; Synthetic data
    DOI:  https://doi.org/10.1007/s00417-026-07284-3
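    To illustrate the GLCM-based texture analysis mentioned in this entry, here is a minimal pure-Python sketch of a gray level co-occurrence matrix and its contrast feature. The 4x4 toy image, the horizontal one-pixel offset, and the function names are assumptions chosen for illustration; the abstract does not specify the offsets, gray-level quantization, or texture features the authors used.

```python
def glcm_horizontal(image, levels):
    """Count gray-level pairs (i, j) for each pixel and its right-hand neighbour."""
    matrix = [[0] * levels for _ in range(levels)]
    for row in image:
        for left, right in zip(row, row[1:]):
            matrix[left][right] += 1
    return matrix


def glcm_contrast(matrix):
    """Contrast = sum over (i, j) of p(i, j) * (i - j)^2, with p the normalised counts."""
    total = sum(sum(row) for row in matrix)
    weighted = sum(
        count * (i - j) ** 2
        for i, row in enumerate(matrix)
        for j, count in enumerate(row)
    )
    return weighted / total


# Hypothetical 4-level image patch (not from the study).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
glcm = glcm_horizontal(image, levels=4)
contrast = glcm_contrast(glcm)
print(glcm, contrast)
```

    Higher contrast means neighbouring pixels differ more in intensity. Features of this kind would then be passed to the classifier alongside color and lesion features.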
  3. Int J Med Inform. 2026 May 16. pii: S1386-5056(26)00228-5. [Epub ahead of print]216 106488
       BACKGROUND: Gestational diabetes mellitus (GDM) affects 15-25% of pregnancies worldwide and poses serious risks of macrosomia, preeclampsia, neonatal hypoglycaemia, and long-term type 2 diabetes. Existing machine learning models lack prospective multi-site external validation and formal physician trust evaluation, limiting real-world applicability.
    OBJECTIVES: To develop a clinically validated, explainable deep learning framework for GDM prediction using routinely available first-antenatal-visit clinical features, and to evaluate clinical readiness through dual-stage physician-in-the-loop (PITL) validation.
    METHODS: A TabNet binary classifier was developed on 3,525 clinical records using a three-stage feature-tailored hybrid imputation strategy (GAIN for HDL and OGTT; MissForest for Systolic BP; Mean for BMI). To prevent data leakage, SMOTE-based class balancing was applied exclusively within the training folds of a 5-fold stratified cross-validation pipeline, with validation folds remaining untouched. Explainability was delivered through TabNet intrinsic feature masks, SHAP, and LIME. Two-stage clinical validation comprised: (1) blinded PITL review by four certified obstetricians evaluating 30 patient cases with XAI explanations; and (2) prospective external validation across three independent Kerala hospitals totaling 80 patients.
    RESULTS: The proposed TabNet model achieved 97.13% accuracy, 94.05% precision, 98.91% recall, and 96.22% F1-score, outperforming ten baseline classifiers including Random Forest, XGBoost, and SVM under identical preprocessing conditions. Compared with recent state-of-the-art GDM prediction studies, the proposed model consistently outperformed comparable methods under a rigorous 5-fold cross-validation strategy with confidence intervals, whereas most existing studies rely on single train-test splits without cross-validation. PITL validation yielded 96.7% concordance, an average Cohen's kappa of 0.909, and Fleiss' kappa of 0.963, with no prior GDM study reporting such formal physician endorsement. External multi-site F1 scores ranged from 83.70% to 87.00% across all three hospitals, reflecting an expected performance reduction in prospective real-world data, partly attributed to inter-site variability in feature availability and clinical data recording protocols. SHAP analysis identified a strong model-level interaction between PCOS and prediabetes as the dominant combined GDM risk signal, independently corroborated by all four obstetricians.
    CONCLUSION: The proposed framework integrates explainable deep learning with prospective dual-stage clinical validation, demonstrating promising performance as a clinically oriented proof-of-concept for the assessment of risk of GDM using routine clinical variables.
    TRIAL REGISTRATION: Clinical Trials Registry India, CTRI/2024/08/073158.
    Keywords:  Clinical validation; Deep learning; Explainable AI; Gestational diabetes mellitus; Multi-site validation; Physician-in-the-loop; TabNet
    DOI:  https://doi.org/10.1016/j.ijmedinf.2026.106488
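    The leakage-safe balancing described in this entry (resampling only inside the training folds) can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: plain random oversampling stands in for SMOTE to keep the example dependency-free, and the toy labels, fold helper, and function names are hypothetical.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k, seed=0):
    """Split sample indices into k folds while preserving class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


def oversample_minority(train_idx, labels, seed=0):
    """Randomly duplicate minority-class indices until classes are balanced.

    Stand-in for SMOTE: it resamples existing rows instead of synthesising new ones.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx in train_idx:
        by_class[labels[idx]].append(idx)
    target = max(len(indices) for indices in by_class.values())
    balanced = []
    for indices in by_class.values():
        balanced.extend(indices)
        balanced.extend(rng.choices(indices, k=target - len(indices)))
    return balanced


# Hypothetical imbalanced labels: 10 positives, 40 negatives.
labels = [1] * 10 + [0] * 40
folds = stratified_folds(labels, k=5)
splits = []
for i, val_idx in enumerate(folds):
    train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
    # Balancing touches the training indices only; the validation fold stays as-is.
    splits.append((oversample_minority(train_idx, labels, seed=i), val_idx))
```

    Because resampling happens after the split, no duplicated or synthesised minority sample can share information with the validation fold, so the validation metrics keep the original class balance.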
  4. Stud Health Technol Inform. 2026 May 21. 336 348-352
      Women with a history of gestational diabetes mellitus (GDM) are at elevated risk of developing type 2 diabetes mellitus (T2DM) postpartum. This study explores the use of interpretable machine-learning models to examine associations between clinical, metabolic, and lifestyle factors and postpartum diabetes status among women with a history of GDM. A publicly available dataset of 1,496 women with prior GDM, comprising 29 medical, anthropometric, laboratory, and lifestyle variables, was analyzed. Eight classifiers were evaluated using 5-fold stratified cross-validation, including Logistic Regression (LogR), Decision Tree, Random Forest, Support Vector Machine, Naïve Bayes, K-Nearest Neighbor, XGBoost, and Multilayer Perceptron. Model performance was assessed using accuracy, recall, precision, F1-score, specificity, and AUC. LogR yielded the best accuracy (93.0%). Feature importance analyses identified HOMA-IR, C-peptide, uric acid, family history of diabetes, and AST as the strongest correlates of diabetes status. Given the lack of temporal information on feature measurement relative to diagnosis, results should be interpreted as associative rather than predictive. These findings demonstrate the utility of interpretable machine-learning approaches for exploratory postpartum diabetes risk stratification and underscore the need for future longitudinal validation.
    Keywords:  Gestational diabetes; artificial intelligence; machine learning; type 2 diabetes
    DOI:  https://doi.org/10.3233/SHTI260175
  5. Sci Rep. 2026 May 17. pii: 15226. [Epub ahead of print]16(1):
      Diabetic retinopathy (DR) is a leading cause of preventable blindness, motivating the development of reliable automated screening systems. This work proposes a Robust Deep Ensemble for Diabetic Retinopathy detection (RDE-DR) by analyzing ensemble fusion strategies. Four pre-trained convolutional neural networks (ResNet50, VGG16, VGG19, and DenseNet121) are trained using CLAHE-enhanced APTOS 2019 fundus images and integrated through seven heterogeneous fusion mechanisms, including voting-based, rank-based, and fuzzy-integral-inspired strategies. A consistent evaluation protocol is adopted, incorporating threshold optimization and probabilistic calibration analysis to validate robustness, decision margins, and accuracy-precision trade-offs. Experimental results show that multiple fusion techniques achieve comparable high performance and stable behavior on the APTOS 2019 benchmark, with the best configuration reaching 98.64% accuracy, 98.40% precision, 98.92% recall, 98.66% F1-score, and 99.78% Area-Under-Curve (AUC). Beyond peak accuracy, the study provides insights into ensemble reliability, calibration characteristics, and practical design choices for medical image classification systems. These results show that integrating transfer learning with CLAHE preprocessing and ensemble fusion yields stable experimental performance on the APTOS 2019 benchmark, suggesting potential for future medical decision support.
    Keywords:  Deep ensemble learning; Diabetic retinopathy; Fundus images; Image preprocessing; Threshold optimization; Transfer learning
    DOI:  https://doi.org/10.1038/s41598-026-48669-y
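    Two of the simplest fusion strategies of the kind this entry compares are hard (majority) voting and soft (probability-averaging) voting. The sketch below shows how they can disagree on the same inputs. The four probability rows are invented for illustration and merely labelled with the backbone names from the abstract; the study's rank-based and fuzzy-integral-inspired schemes are not reproduced here.

```python
from collections import Counter


def soft_vote(prob_rows):
    """Average class probabilities across models and return (label, mean_probs)."""
    n_classes = len(prob_rows[0])
    mean_probs = [
        sum(row[c] for row in prob_rows) / len(prob_rows) for c in range(n_classes)
    ]
    label = max(range(n_classes), key=lambda c: mean_probs[c])
    return label, mean_probs


def hard_vote(prob_rows):
    """Let each model vote for its most probable class and return the majority label."""
    votes = [max(range(len(row)), key=lambda c: row[c]) for row in prob_rows]
    return Counter(votes).most_common(1)[0][0]


# Hypothetical [P(no DR), P(DR)] outputs for one image (not from the study).
model_probs = {
    "ResNet50": [0.95, 0.05],
    "VGG16": [0.45, 0.55],
    "VGG19": [0.40, 0.60],
    "DenseNet121": [0.48, 0.52],
}
rows = list(model_probs.values())
soft_label, mean_probs = soft_vote(rows)
hard_label = hard_vote(rows)
print(soft_label, hard_label)
```

    Here three models lean weakly towards DR while one is strongly confident of no DR, so the majority vote and the averaged probabilities point in opposite directions. This is one reason such comparisons also look at decision thresholds and calibration.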
  6. PLoS One. 2026 ;21(5): e0347672
      Diabetes is a common chronic disease that requires early diagnosis and proper management to avoid severe complications. Although current artificial intelligence (AI) tools generate predictive information, they often lack an integrated component for post-diagnosis support, leaving a critical gap in patient self-management. This research proposes and validates a hybrid system that aims to bridge this gap. The methodology is based on a novel fused dataset (PIMA and Type 2 Diabetes) that was preprocessed following a leakage-safe protocol to increase generalizability. The system architecture combines two components: a Bidirectional Long Short-Term Memory (BiLSTM) model for prediction and a rule-based engine for generating personalized lifestyle recommendations. To validate the efficacy of the BiLSTM model, seven traditional machine learning (ML) models and standard deep learning (DL) models were tested for comparison; the BiLSTM model demonstrated better generalization and prediction performance. The system was validated with 10-fold cross-validation, yielding an accuracy of 84.02%, precision of 87.89%, and recall of 80.50%. The study concludes that combining the predictive engine with a real-time guidance module makes it possible to develop a holistic, clinically relevant tool that closes the loop between diagnosis and proactive self-management.
    DOI:  https://doi.org/10.1371/journal.pone.0347672
  7. Diabetes Res Clin Pract. 2026 May 16. pii: S0168-8227(26)00245-7. [Epub ahead of print] 113325
       AIMS: To develop a machine learning framework for predicting type 2 diabetes mellitus (T2DM) using administrative data and electronic health records (EHR) that could be applied in healthcare settings.
    METHODS: The study population included parents of individuals born in 1970-1990 who resided in Utah urban counties during 1990-2015. Two prediction models were developed using classification and regression tree (CART) methods. A "follow-back" design used data from 2010 to 2015 to predict T2DM incidence between 2016 and 2021. An "age-based" design used data from ages 40-45 to predict T2DM incidence between ages 46 and 50. Potential predictors included individual sociodemographic characteristics, family history of T2DM, and neighborhood environmental measures.
    RESULTS: The follow-back and age-based cohorts included 240,163 and 126,525 individuals, respectively. The final CART decision rules demonstrated high sensitivity (90-95%), with overweight status consistently selected as the primary decision rule across study designs. Racial-ethnic minority populations and individuals living in urban/socioeconomically deprived areas were identified as having elevated risk for T2DM, even at younger ages and normal/underweight BMI.
    CONCLUSIONS: Application of machine learning models for T2DM prediction should be tailored to specific study designs and population characteristics and consider incorporating environmental data relevant to the local context. Opportunities exist to utilize administrative data and EHR for machine learning-based prediction.
    Keywords:  Administrative records; Built environment; Classification and regression trees; Diabetes; Family history; Machine learning; Recursive partitioning analysis
    DOI:  https://doi.org/10.1016/j.diabres.2026.113325
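    As background on how a CART-style tree might pick a primary decision rule such as overweight status, the sketch below scores a binary split by its reduction in Gini impurity. The Gini criterion is a common choice for classification trees but is an assumption here, since the abstract does not name the splitting criterion, and the ten toy outcomes are invented.

```python
def gini(labels):
    """Gini impurity of a list of binary (0/1) outcomes."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 1.0 - p**2 - (1.0 - p) ** 2


def split_gain(left, right):
    """Impurity reduction obtained by splitting the parent node into left and right."""
    parent = left + right
    weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(parent)
    return gini(parent) - weighted


# Hypothetical T2DM outcomes (1 = incident T2DM) on each side of an overweight split.
overweight = [1, 1, 1, 0]
not_overweight = [0, 0, 0, 1, 0, 0]
gain = split_gain(overweight, not_overweight)
print(round(gain, 4))
```

    A tree builder would compute this gain for every candidate predictor and threshold and place the best-scoring one at the root, which is the sense in which a variable becomes the primary rule.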
  8. NPJ Syst Biol Appl. 2026 May 18.
      Defining molecular pathways driving β-cell failure in type 2 diabetes (T2D) is challenging given donor heterogeneity. We developed an interpretable machine learning framework coupling sparse rule-based classification, pathway constrained modeling, and mitochondrial fitness stratification, applied to single-cell RNAseq from 52 human islet donors. A 50-gene classifier predicted T2D at single-cell resolution, outperforming ensemble models, with donor-level scores correlating with HbA1c. We identified a resilient non-diabetic (ND) β-cell subtype with preserved β-cell identity, while T2D β-cell subtypes showed cellular stress and suppressed oxidative phosphorylation. Mitophagy emerged as the dominant cellular pathway, with PINK1, BNIP3, and FUNDC1 as predictors. At the donor level, PINK1 expression decreased with T2D score and correlated with sex‑specific mitophagy patterns. We developed a mitochondrial fitness index (MFI, R² = 0.934) integrating mitophagy, proteostasis, biogenesis, and respiration, identifying PINK1, SQSTM1, PRKN, and BNIP3 as top T2D contributors. Interpretable machine learning revealed mitophagy as central to β-cell metabolic fitness.
    DOI:  https://doi.org/10.1038/s41540-026-00742-y
  9. BMC Ophthalmol. 2026 May 19.
       BACKGROUND: Early and reliable grading of diabetic retinopathy is important for preventing avoidable vision loss. Although deep learning methods have shown strong performance on retinal fundus images, grading remains difficult because of class imbalance, variable image quality, subtle differences between adjacent disease stages, and the need for interpretable predictions. This study developed a clinically interpretable deep learning pipeline for five-class diabetic retinopathy grading.
    METHODS: A classification framework based on EfficientNet-B1 was developed using contrast-limited adaptive histogram equalization for image enhancement, sample-mixing augmentation, focal loss, and test-time augmentation. Model behavior was further examined using calibration analysis, gradient-weighted class activation mapping, and t-distributed stochastic neighbor embedding. The public APTOS 2019 dataset was used. Under the fixed held-out protocol, 2,930 images were used for training, 364 for validation, and 366 for testing. Robustness was further examined using stratified 5-fold cross-validation on a 3,294-image development set with an independent 366-image test set. Performance was summarized using accuracy, macro F1-score, weighted F1-score, average area under the receiver operating characteristic curve, confusion matrices, and calibration error. No formal hypothesis-testing framework was applied.
    RESULTS: The best validation accuracy under the fixed split was 84.07%. On the independent test set, the proposed model achieved 83.61% accuracy without test-time augmentation and 84.97% with test-time augmentation. The average area under the receiver operating characteristic curve was 90.40%. Removing contrast-limited adaptive histogram equalization reduced test accuracy to 80.87%, supporting its contribution to performance. The model showed improved robustness to class imbalance and provided clinically meaningful visual explanations, although Severe diabetic retinopathy remained the most challenging category.
    CONCLUSIONS: The proposed pipeline achieved competitive benchmark performance for five-class diabetic retinopathy grading and combined classification accuracy with supportive calibration and interpretability analyses. The method appears suitable as a research-stage decision-support approach, but the remaining difficulty in advanced disease grading and the use of a single public dataset indicate that external validation is still required before clinical deployment.
    TRIAL REGISTRATION: This study used a publicly available benchmark dataset and did not involve a randomized controlled trial or prospective enrollment of human participants.
    Keywords:  Data augmentation; Deep learning; Diabetic retinopathy; EfficientNet-B1; GradCAM
    DOI:  https://doi.org/10.1186/s12886-026-04897-4
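    The focal loss used in this entry to counter class imbalance down-weights examples the model already classifies confidently. A minimal per-example sketch follows. The focusing parameter gamma = 2 and weight alpha = 1 are commonly used defaults assumed here; the abstract does not report the values the authors chose.

```python
import math


def focal_loss(p_true, gamma=2.0, alpha=1.0):
    """Focal loss for one example, where p_true is the predicted probability of the true class.

    With gamma = 0 and alpha = 1 this reduces to ordinary cross-entropy, -log(p_true).
    """
    return -alpha * (1.0 - p_true) ** gamma * math.log(p_true)


easy = focal_loss(0.9)  # confidently correct example
hard = focal_loss(0.1)  # badly misclassified example
cross_entropy_easy = focal_loss(0.9, gamma=0.0)
print(easy, hard, cross_entropy_easy)
```

    With gamma = 2, the easy example's loss is scaled by (1 - 0.9)^2 = 0.01 relative to cross-entropy, so training gradients concentrate on hard and typically rarer cases such as the severe grade.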
  10. Front Endocrinol (Lausanne). 2026 ;17 1834629
       Background: Early detection of diabetic retinopathy (DR) remains challenging in primary care, where access to ophthalmic screening is limited. We developed and validated a prediction model using routinely collected health data to identify diabetic patients at increased risk of DR.
    Methods: This retrospective study included 1,475 diabetic patients from three community health centers in China. The cohort was split into a development set (n = 1,177) and a held-out test set (n = 298). We developed three machine learning models using 5-fold cross-validation: penalized logistic regression (GLMNET), extreme gradient boosting (XGBoost), and random forest (Ranger). Model performance was evaluated using the area under the receiver operating characteristic curve (AUROC), area under the precision-recall curve (AUPRC), Brier score, calibration, and decision curve analysis. Feature importance was assessed using SHapley Additive exPlanations (SHAP).
    Results: DR prevalence was 13.5%. In the test set, GLMNET achieved an AUROC of 0.770 (95% CI 0.671-0.856) and an AUPRC of 0.452 (95% CI 0.325-0.620). Its Brier score was 0.095, with a calibration intercept of 0.206 and a calibration slope of 0.953. XGBoost showed comparable discrimination, whereas Ranger performed less favorably. Decision curve analysis suggested possible net benefit across threshold probabilities from 10% to 40%. SHAP analyses identified urine glucose as the most influential predictor.
    Conclusions: This model showed moderate discrimination and acceptable but imperfect calibration in a three-center community-based cohort. Its use of routinely collected variables and transparent model structure suggests potential value for risk stratification in primary care, but external validation and prospective implementation studies are required before routine clinical use.
    Keywords:  clinical prediction model; diabetic retinopathy; machine learning; primary care; risk stratification
    DOI:  https://doi.org/10.3389/fendo.2026.1834629
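    To put the reported Brier score of 0.095 in context, the sketch below computes a Brier score and the score of a non-informative model that predicts the prevalence for every patient, which equals p(1 - p). Using the overall 13.5% prevalence as the test-set prevalence is an approximation, and the four toy predictions are invented.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def constant_predictor_brier(prevalence):
    """Brier score when every patient is assigned the prevalence as their risk."""
    return prevalence * (1.0 - prevalence)


# Hypothetical predictions and outcomes (not from the study).
toy = brier_score([0.1, 0.2, 0.8, 0.3], [0, 0, 1, 1])

# Approximate reference value at the reported 13.5% DR prevalence.
baseline = constant_predictor_brier(0.135)
print(toy, baseline)
```

    The reference value is about 0.117, so a score of 0.095 is better than predicting the base rate for everyone but leaves considerable room for improvement. That is consistent with the moderate discrimination the authors report.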
  11. Sci Rep. 2026 May 19.
      Diabetic retinopathy (DR) is one of the major causes of preventable blindness worldwide, and accurate large-scale screening tools are urgently needed. Most deep learning methods developed for retinal image analysis treat tasks such as optic disc segmentation and DR grading separately. This separation makes it difficult for a model to use the shared anatomical and contextual cues that link the two tasks. We propose GTAM-Net, a Gated Task-Attentive Multi-Task Network for retinal image analysis. GTAM-Net performs optic disc segmentation and DR severity grading together inside a single end-to-end network. Inside the network, a gated task-attentive block decides how features are shared between the two tasks at each layer. In this way the network keeps the useful complementary information for each task while avoiding the negative transfer that often hurts multi-task models. We also use a multi-scale feature pyramid to preserve hierarchical context, and an uncertainty-based loss weighting so that neither task dominates training. The proposed method is tested on five public datasets: IDRiD, DDR, Messidor-2, APTOS, and REFUGE. The model reaches up to 98.17% Dice score for optic disc segmentation and 99.12% accuracy for DR grading, and its performance is competitive on every dataset tested. Cross-dataset tests also show that the model is fairly stable when imaging conditions change. These results suggest that the proposed multi-task design is a useful and reasonably stable option for joint retinal image analysis that can be considered for use in large screening pipelines.
    Keywords:  Diabetic retinopathy grading; Gated attention network; Multi-task learning; Optic disc segmentation; Retinal image analysis
    DOI:  https://doi.org/10.1038/s41598-026-52418-6
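    The Dice score reported for optic disc segmentation measures the overlap between a predicted and a reference mask as 2|A ∩ B| / (|A| + |B|). A minimal sketch for binary masks follows. The two 2x3 masks are hypothetical, and returning 1.0 when both masks are empty is a convention assumed here.

```python
def dice_score(pred_mask, true_mask):
    """Dice coefficient between two binary masks given as nested lists of 0/1."""
    pred = [value for row in pred_mask for value in row]
    true = [value for row in true_mask for value in row]
    intersection = sum(p * t for p, t in zip(pred, true))
    denominator = sum(pred) + sum(true)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if denominator == 0 else 2.0 * intersection / denominator


# Hypothetical masks (not from the study).
predicted = [[0, 1, 1],
             [0, 1, 0]]
reference = [[0, 1, 0],
             [0, 1, 1]]
dice = dice_score(predicted, reference)
print(round(dice, 3))
```

    Each mask here marks three pixels and they share two, giving 2 x 2 / (3 + 3), roughly 0.667. A Dice score near 98% therefore indicates almost complete pixel-level agreement.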
  12. Photodiagnosis Photodyn Ther. 2026 May 20. pii: S1572-1000(26)00189-4. [Epub ahead of print] 105522
       INTRODUCTION: Evaluating retinal fundus images for diabetic retinopathy (DR) assessment helps reduce the risk of blindness among diabetic patients. DR staging, which assesses disease severity or progression, is one of the challenging tasks in DR screening. We introduce a novel radiomics and machine learning algorithm (MLA) framework that analyzes localized vascular leakage sources in optical coherence tomography angiography (OCT-A) images, using fundus fluorescein angiography (FFA) as reference, to noninvasively stage DR. By extracting 23 optimized features from leakage-prone regions and employing extreme gradient boosting (XGBoost), adaptive boosting (AdaBoost), and multilayer perceptron (MLP) classifiers, our approach achieves unprecedented accuracy in differentiating DR stages from moderate NPDR to early proliferative diabetic retinopathy.
    METHOD: We developed a novel radiomic and MLA framework to extract features (vascular, image-based, and clinical) from leakage source regions in OCT-A images identified by corresponding FFA images. XGBoost, AdaBoost, and MLP classifiers were built, and their diagnostic accuracy for different DR stages was evaluated. MLA performance was comprehensively evaluated via multiple metrics (accuracy, precision, recall, AUC, sensitivity, specificity) and visualized through confusion matrices (CMs) and receiver operating characteristic (ROC) curves to ensure robust clinical applicability.
    RESULT: A dataset of 99 patients and 179 images was included. For DR staging, the highest accuracy (96.6±1.4% [95% CI: 93.9-99.3%]) and AUC (98.9±0.5%) were observed for the XGBoost classifier with image-based features alone. The MLP performed better with vascular and clinical features (accuracy: 92.5±3.2% [95% CI: 86.2-98.8%]; AUC: 95.2%). Entropy, Haralick, GLCM, eccentricity, vessel branch, vessel density, tortuosity (combining geometric analysis and wavelet transform), fractal dimension analysis (combining box counting and wavelet-based), and fasting blood sugar emerged as particularly significant discriminative features across DR stages.
    CONCLUSION: Our approach identifies distinctive vascular and textural biomarkers (including hybrid fractal dimension, vessel tortuosity, and Haralick features) that effectively differentiate DR stages while providing new pathophysiological insights into microvascular remodeling in leakage-prone areas. By bridging non-invasive OCT-A with leakage-specific assessment, this method offers a clinically viable tool for early detection, progression prediction, and personalized DR management. Although limited by macular-centric scans and exclusion of very early/advanced stages, our findings establish a foundation for future multi-center studies with wider fields of view. This work represents a significant advance toward AI-enhanced ophthalmology, paving the way for actionable diagnostic paradigms in diabetic eye disease.
    Keywords:  Diabetic Retinopathy; Machine Learning; Optical Coherence Tomography Angiography; Radiomics; Tomography
    DOI:  https://doi.org/10.1016/j.pdpdt.2026.105522
  13. Front Endocrinol (Lausanne). 2026 ;17 1852512
       Background: Interstitial fibrosis and tubular atrophy (IFTA) are key pathological features of chronic kidney damage and progression in diabetic nephropathy (DN). Early identification of patients at higher risk of IFTA may support risk stratification, although reliable non-invasive tools remain limited. This study aimed to develop and validate machine learning (ML) models for predicting IFTA in patients with biopsy-confirmed DN.
    Methods: In this retrospective study, 232 patients with biopsy-confirmed DN from 2017 to 2025 were included and randomly divided into a training cohort (n = 164) and a validation cohort (n = 68). Baseline clinical and laboratory variables were collected. Feature selection was performed using least absolute shrinkage and selection operator (LASSO) regression with 10-fold cross-validation. Seven ML algorithms (logistic regression, support vector machine, random forest, XGBoost, LightGBM, decision tree, and artificial neural network) were developed. Model performance was evaluated using receiver operating characteristic curves, calibration plots, and decision curve analysis. Model interpretability was assessed using SHAP.
    Results: Seven predictors were identified, including diabetic retinopathy, age, proteinuria, estimated glomerular filtration rate (eGFR), triglycerides, duration of diabetes, and hemoglobin. Among the models, XGBoost achieved the highest area under the curve (AUC) in the validation cohort (0.759), with an accuracy of 72.1%, sensitivity of 92.3%, specificity of 44.8%, and F1 score of 79.1%. Overall, the model showed moderate discrimination, with high sensitivity but limited specificity, suggesting potential value for exploratory risk screening rather than definitive clinical use. SHAP analysis indicated that higher proteinuria, triglycerides, presence of diabetic retinopathy, and longer diabetes duration, together with lower eGFR, hemoglobin, and younger age, were associated with an increased predicted risk of IFTA.
    Conclusion: ML models, particularly XGBoost, showed moderate performance in predicting IFTA in patients with biopsy-confirmed DN using routinely available clinical variables. These findings support the feasibility of an interpretable, non-invasive approach for exploratory risk estimation of tubulointerstitial injury. However, because of the modest sample size, limited specificity, relatively high false positive rate, and lack of external validation, the present results should be considered preliminary and require further validation before clinical use.
    Keywords:  SHAP; diabetic nephropathy; interstitial fibrosis; machine learning; tubular atrophy
    DOI:  https://doi.org/10.3389/fendo.2026.1852512
  14. JMIR Diabetes. 2026 May 20. 11 e85372
       Background: The rate of treatment failure with sodium-glucose cotransporter-2 inhibitors (SGLT2i) is high among individuals with type 2 diabetes (T2D). Accurately predicting SGLT2i treatment failure is important for improving the clinical management of T2D.
    Objective: The study aimed to use machine learning (ML) models to identify factors predicting treatment failure with SGLT2i in T2D and to evaluate model performance.
    Methods: This retrospective observational cohort study included adults with T2D treated with SGLT2i (2016-2024). The primary outcome was overall treatment failure with SGLT2i during follow-up (≥180 days after SGLT2i initiation). The secondary outcome was subtypes of treatment failure with SGLT2i (treatment discontinuation, failure with action, and inertial failure) or nonfailure, which was defined as not meeting the definition for one of the failure subtypes. Variables potentially associated with treatment failure were assessed during the year before SGLT2i treatment initiation (analysis 1) and the year before SGLT2i treatment failure (analysis 2). Using these variables, ML models (logistic regression [LR], multilayer perceptron [MLP], extreme gradient boosting [XGBoost], and Transformer) were used to identify significant predictors of the outcomes. Model performance metrics (accuracy, area under the curve, precision, recall, and F1-score) were calculated. Using Shapley Additive Explanations methodology, key features were identified based on their impact on model predictions. LR and Transformer models using key features were further evaluated for their potential to support the development of a risk score for predicting treatment failure with SGLT2i.
    Results: Among all individuals in the study (N=62,222), 71% (n=44,156) had treatment failure with SGLT2i. Across subtypes, failure with action (n=23,839, 38.3%) was more common than treatment discontinuation (n=16,449, 26.4%) and inertial failure (n=3868, 6.2%). Model performance was moderate in both analyses. In analysis 1, the accuracy ranged from 0.72 to 0.73 for predicting overall treatment failure and from 0.56 to 0.57 for predicting the subtype of treatment failure. In analysis 2, the accuracy ranged from 0.74 to 0.75 for predicting overall treatment failure and from 0.61 to 0.63 for predicting the subtype of treatment failure. XGBoost, MLP, and Transformer models showed small improvements compared with LR. Using the top 9 key features identified from the Shapley Additive Explanations analysis, the Transformer model performed similarly in accuracy and area under the curve to its counterpart using the full feature set.
    Conclusions: Performance across the LR, MLP, XGBoost, and Transformer models was moderate. The advanced ML models performed slightly better than LR. Overall, the results suggest that further model advancements and increased data availability are needed to better predict treatment failure with SGLT2i. The LR coefficients from the key features model may inform the development of a risk score to predict SGLT2i treatment failure. Accurate prediction could inform individualized treatment planning for individuals with T2D.
    Keywords:  SGLT2i; artificial intelligence; machine learning; sodium-glucose cotransporter-2 inhibitors; treatment failure; type 2 diabetes
    DOI:  https://doi.org/10.2196/85372
  15. Digit Health. 2026 Jan-Dec;12:12 20552076261453207
       Background: Diabetes mellitus affects approximately 589 million adults worldwide, with a large proportion remaining undiagnosed until complications arise. Accurate, data-driven early detection tools are urgently needed to support timely clinical intervention.
    Objectives: This study aimed to develop a hybrid ensemble framework integrating multiple feature selection strategies with a consensus approach to improve diabetes prediction accuracy and clinical interpretability.
    Methods: A publicly available dataset of 1,879 patients with 46 features was analysed. Six interaction features (e.g., HbA1c/FBS ratio, Age×BMI) were engineered. Four supervised selection methods, namely Recursive Feature Elimination (RFE), Random Forest, ANOVA, and Mutual Information, were combined with Principal Component Analysis (PCA), and a consensus criterion (≥3 of 4 supervised methods) defined the final feature subset. Seven classifiers were individually optimised via GridSearchCV with 5-fold stratified cross-validation and integrated into a soft voting ensemble, evaluated on a stratified held-out test set (20%, n = 376) using accuracy, precision, recall, F1-score, and AUC-ROC with 95% confidence intervals.
    Results: Eight consensus features were identified, namely HbA1c, fasting blood sugar, hypertension, excessive thirst, frequent urination, cholesterol LDL, BP_diff, and HbA1c/FBS ratio, reducing dimensionality by 82.6% (46 to 8 features). The soft voting ensemble achieved an AUC of 0.948 (95% CI: 0.926-0.970), accuracy of 92.6% (95% CI: 89.0%-95.2%), precision of 91.8%, recall of 89.4%, and F1-score of 90.6%, outperforming all individual classifiers in recall and F1-score.
    Conclusion: The proposed framework combines supervised and unsupervised feature selection with ensemble learning, yielding a clinically interpretable and high-performing diabetes prediction model. Its consensus-driven feature transparency and robust generalisation support deployment in early screening and digital health applications. Future work should prioritise external multi-centre validation, explainable AI integration, and real-time clinical decision support development.
    Keywords:  diabetes prediction; ensemble learning; feature selection; majority voting
    DOI:  https://doi.org/10.1177/20552076261453207
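The consensus criterion described in the Methods above (a feature is kept when at least 3 of the 4 supervised selection methods choose it) can be sketched as a simple vote count. This is a minimal illustration and not the authors' code: the function name `consensus_features` and the per-method feature lists are hypothetical, chosen only to show the mechanism.

```python
from collections import Counter


def consensus_features(selections, min_votes=3):
    """Return features chosen by at least `min_votes` selection methods."""
    votes = Counter()
    for chosen in selections.values():
        votes.update(set(chosen))  # each method votes at most once per feature
    return sorted(name for name, count in votes.items() if count >= min_votes)


# Hypothetical selector outputs (illustrative only, not taken from the study).
selections = {
    "rfe": ["HbA1c", "FBS", "hypertension", "BMI"],
    "random_forest": ["HbA1c", "FBS", "BP_diff", "age"],
    "anova": ["HbA1c", "FBS", "hypertension", "BP_diff"],
    "mutual_information": ["HbA1c", "hypertension", "BP_diff", "age"],
}

selected = consensus_features(selections, min_votes=3)

# Dimensionality reduction as reported in the abstract: 46 features down to 8.
reduction_pct = 100.0 * (46 - 8) / 46
```

Raising `min_votes` to 4 keeps only features on which every method agrees, which trades recall of useful features for a smaller, more conservative subset.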
  16. Metabolism. 2026 May 20. pii: S0026-0495(26)00160-5. [Epub ahead of print] 156650
       BACKGROUND: Type 2 diabetes (T2D) causes multisystem complications, but an integrated multi-omics framework for cross-system, multi-outcome analysis is lacking. We aimed to comprehensively construct the proteomic and metabolomic atlas of major T2D outcomes and to identify predictive panels that balance performance and clinical feasibility.
    METHODS: Among UK Biobank participants with T2D, we established proteomic (n = 3104), metabolomic (n = 28,834), and multi-omics (n = 3059) subcohorts. Using cross-sectional and longitudinal analyses, we systematically evaluated the associations of plasma proteins and metabolites with 19 T2D-related outcomes. Predictive models were developed using machine learning-based molecular feature selection and were compared with the clinical risk model.
    RESULTS: The study identified molecular signals that consistently exhibited positive or negative associations across multiple T2D outcomes, revealing shared biological pathways. We also uncovered outcome-specific and heterogeneous molecular signatures. Furthermore, protein-based models substantially outperformed clinical models (median delta C-index = 0.108; range: 0.063-0.143), while combined models achieved the best performance (median delta C-index = 0.109; range: 0.080-0.150) with consistent improvements in reclassification metrics, whereas metabolites provided only modest incremental gains (median delta C-index = 0.027; range: 0.006-0.070). Evaluation across varying selection thresholds identified a simplified panel of 174 proteins that maintained robust predictive performance.
    CONCLUSION: This large-scale multi-omics study systematically constructs the molecular atlas of T2D complications, providing new insights into disease biology and potential therapeutic targets. It further defines the predictive value of proteomic and metabolomic profiles and proposes a clinically feasible and practical framework for risk prediction and precision intervention.
    Keywords:  Machine learning; Metabolomics; Molecular atlas; Multisystem complications; Predictive model; Proteomics; Type 2 diabetes
    DOI:  https://doi.org/10.1016/j.metabol.2026.156650
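The delta C-index values quoted above compare the concordance of two risk models on the same participants. As a rough sketch, assuming Harrell's definition of the C-index (the abstract does not state which estimator was used), the comparison could look like the following. All names and numbers here are toy values for illustration, not study data.

```python
def concordance_index(times, events, risks):
    """Harrell's C: share of comparable pairs ranked correctly by risk."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy follow-up times, event indicators, and two sets of risk scores.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risk_model_a = [0.5, 0.6, 0.4, 0.1]
risk_model_b = [0.9, 0.6, 0.4, 0.1]

c_a = concordance_index(times, events, risk_model_a)
c_b = concordance_index(times, events, risk_model_b)
delta_c = c_b - c_a
```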
  17. Diabetes Ther. 2026 May 19.
      With the rising prevalence of type 2 diabetes (T2D) among children and adolescents, the ability to predict the progression of prediabetes to T2D in youth is imperative, as it significantly affects long-term health and quality of life. A number of biological and social risk factors have been identified in the literature. Additionally, a growing body of literature illustrates the use of machine learning techniques to identify adults with prediabetes or T2D. A prediction model identifying adults with prediabetes who are most likely to progress to T2D has been suggested; however, to date, no such predictive algorithm has been developed for youth with prediabetes, a population with a high rate of spontaneous regression to normoglycemia. Machine learning (ML) techniques can be applied to potentially identify novel risk factors for youth-onset T2D progression. The use of ML techniques with longitudinal data would enrich the prediction models to accurately identify children with prediabetes who are at the greatest risk of developing T2D without intervention. This narrative review summarizes the literature on biological and social risk factors for T2D progression and the use of ML to predict progression to T2D.
    Keywords:  Machine learning; Pediatric type 2 diabetes; Prediabetes; Prediction models; Risk factors
    DOI:  https://doi.org/10.1007/s13300-026-01873-5
  18. Best Pract Res Clin Endocrinol Metab. 2026 May 09. pii: S1521-690X(26)00041-2. [Epub ahead of print] 102119
      Type 2 diabetes disproportionately affects South Asian populations, who typically develop earlier onset disease associated with a high lifetime burden of complications. Conventional tools to predict diabetes complications, largely derived from European-ancestry cohorts and traditional clinical variables, perform suboptimally in South Asians and leave substantial residual risk unexplained. This review synthesises evidence on non-conventional predictors of microvascular and macrovascular complications in South Asian adults with type 2 diabetes. We examine life-course and early developmental determinants, ectopic fat, socioeconomic and environmental exposures, psychological and behavioural factors, non-conventional glycaemic metrics, novel organ-specific biomarkers, genetics, and artificial intelligence approaches. Although many predictors show promise, most lack validation in longitudinal South Asian cohorts and meaningful predictive value beyond existing clinical risk scores. Integrating diverse data types may enable more precise risk stratification, but robust external validation and evidence of improved patient outcomes are required before implementation in routine diabetes care.
    Keywords:  South Asians; artificial intelligence; biomarkers; diabetes complications; precision medicine; risk factors; risk prediction; risk score; type 2 diabetes
    DOI:  https://doi.org/10.1016/j.beem.2026.102119
  19. Prog Mol Biol Transl Sci. 2026;pii: S1877-1173(26)00101-8. [Epub ahead of print] 222 295-326
      The integration of multi-omics technologies has significantly advanced the understanding of the complex and multifactorial pathophysiology of Type 2 Diabetes. In contrast to traditional single-omics approaches, multi-omics integration enables a comprehensive, systems-level characterization of molecular interactions, regulatory mechanisms, and disease progression pathways. This holistic framework facilitates the discovery of novel biomarkers, early diagnostic signatures, and actionable therapeutic targets by linking molecular variations with clinical phenotypes. However, the high dimensionality, heterogeneity, and scale of multi-omics data present substantial analytical challenges. Recent advances in artificial intelligence (AI) have addressed these limitations by providing powerful computational tools for effective data integration, pattern recognition, and predictive modeling. AI-driven approaches, particularly deep learning architectures, enable automated feature extraction and nonlinear relationship mapping across omics layers, thereby improving disease risk prediction and patient stratification. Collectively, the convergence of AI and multi-omics is transforming diabetes research toward predictive, preventive, and personalized healthcare paradigms.
    Keywords:  Artificial intelligence; Epigenomics; Genomics; Machine learning; Metabolomics; Proteomics; Transcriptomics; Type 2 diabetes
    DOI:  https://doi.org/10.1016/bs.pmbts.2026.02.002
  20. Stud Health Technol Inform. 2026 May 21. 336 494-495
      We developed a machine learning model to estimate the personalized risk of cardiovascular (CV) death within 5 years among obese/overweight people with prediabetes. The model has the potential to enhance early preventive CV strategies.
    Keywords:  Cardiovascular death; Machine Learning; Obesity; Prediabetes
    DOI:  https://doi.org/10.3233/SHTI260209
  21. Stud Health Technol Inform. 2026 May 21. 336 665-669
      Prediabetes (PD), a reversible metabolic condition that precedes type 2 diabetes (T2DM), carries a high risk of progression to T2DM, but can be effectively treated by restoring normal blood glucose levels, mainly through lifestyle changes such as diet and physical activity. This study uses data from 21,023 patients from Canadian primary care Electronic Medical Records (EMRs), expanding a pre-existing clinical dataset by adding detailed information on drug prescriptions and smoking status, to train and compare different machine learning models to predict future normoglycemia, PD, or T2DM. The use of explainability techniques enhances our understanding of the progression from PD to T2DM, emphasizing the crucial role of blood sugar levels, body mass index, cholesterol levels and blood pressure in determining T2DM risk. These results highlight the potential of routinely collected clinical data to support early identification of high-risk individuals, enabling preventive interventions. Future studies should validate these findings in prospective cohorts and expand the set of investigated features to further improve the model's predictive performance and generalizability.
    Keywords:  EMR; Explainable AI; Prediabetes; SHAP; Type 2 diabetes
    DOI:  https://doi.org/10.3233/SHTI260254
  22. Diabetes Obes Metab. 2026 May 19.
    MELISSA consortium
       AIMS: Achieving optimal glycaemic control remains a burden for many people with diabetes on intensive insulin treatment. The MELISSA trial aims to clinically validate the artificial intelligence (AI)-based MELISSA system to support people with Type 1 diabetes on multiple daily insulin injections (MDI) with personalised insulin dose recommendations and an innovative approach for automatic carbohydrate estimation. In addition, feasibility will be explored in people with Type 2 diabetes.
    MATERIALS AND METHODS: The MELISSA trial is a 22-week European multicentre, prospective randomised open-label blinded endpoint trial including people with Type 1 (n = 278) and an exploratory cohort of people with Type 2 diabetes (n = 50) on MDI. The MELISSA system consists of two AI-driven features: an adaptive basal-bolus advisor (ABBA), which can be supported by an automated dietary assessment system, goFOOD™. ABBA provides personalised suggestions for basal insulin and mealtime insulin doses, while goFOOD™ converts food images into carbohydrate content estimations. After the initialisation period (Weeks 0-6), participants will be randomised 1:1 to either the MELISSA system or the continuation of usual care.
    RESULTS: The primary endpoint is the between-group change in percentage of time spent in the target range (3.9-10.0 mmol/L [70-180 mg/dL]) from baseline to study end. Secondary outcomes include additional glycaemic metrics, patient-reported outcomes, and safety information. The trial was approved by the Sponsor's Medical Ethics Research Committee (NL-009099).
    CONCLUSION: The MELISSA system will potentially improve glycaemic control and quality of life in people with Type 1 diabetes on MDI. The trial outcomes will provide necessary input for obtaining Conformité Européenne certification (class IIb) for the MELISSA system.
    Keywords:  artificial intelligence; automatic carbohydrate counting; basal‐bolus calculator; continuous glucose monitoring; diabetes management; diabetes technology; insulin therapy
    DOI:  https://doi.org/10.1111/dom.70879
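The primary endpoint above is based on the percentage of time spent in the 3.9-10.0 mmol/L target range. A minimal sketch of that calculation for one participant follows, assuming equally spaced glucose readings so that the share of readings approximates the share of time. The function name and the readings are hypothetical, and the trial's actual between-group analysis is not reproduced here.

```python
def time_in_range(glucose_mmol_l, low=3.9, high=10.0):
    """Percentage of readings within [low, high] mmol/L (bounds inclusive)."""
    if not glucose_mmol_l:
        raise ValueError("no readings")
    in_range = sum(1 for g in glucose_mmol_l if low <= g <= high)
    return 100.0 * in_range / len(glucose_mmol_l)


# Hypothetical readings for one participant at baseline and at study end.
baseline = [3.5, 5.0, 7.2, 11.0, 9.9, 10.0, 12.4, 6.1]
study_end = [4.2, 5.0, 7.2, 9.0, 9.9, 10.0, 12.4, 6.1]

tir_baseline = time_in_range(baseline)
tir_end = time_in_range(study_end)
change_in_tir = tir_end - tir_baseline  # percentage points
```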
  23. Transl Vis Sci Technol. 2026 May 01. 15(5): 12
    Current Research Directions in Ophthalmology Working Group
       Purpose: To determine how optical coherence tomography (OCT) scan density affects quantification of artificial intelligence (AI)-derived structural biomarkers in diabetic macular edema (DME) and to identify density thresholds beyond which biomarker fidelity is compromised.
    Methods: In this cross-sectional study, 401 DME eyes underwent three same-session OCT acquisitions using 97-, 49-, and 25-B-scan raster protocols on a single device. A CE-certified deep learning pipeline quantified intraretinal fluid (IRF) volume, subretinal fluid (SRF) volume, inflammatory hyperreflective foci (I-HRF), and photoreceptor integrity metrics. Linear mixed-effects models assessed density effects, Bland-Altman analyses quantified fixed and proportional bias, and volumetric thresholds were computed for deviations beyond ±0.10 mm³. Acquisition efficiency integrated biomarker variability and scan time.
    Results: A total of 9624 biomarker measurements were analyzed with >98% completeness. SRF volume, I-HRF counts, and photoreceptor integrity metrics were stable across scan densities. IRF volume was density-dependent: the 25-scan protocol overestimated IRF relative to 97- and 49-scan acquisitions (mean bias -0.077 and -0.079 mm³; both P < 0.001), whereas 97- and 49-scan measurements were interchangeable. Overestimation increased with fluid burden (IRF threshold ∼1.1 mm³). Although the 25-scan protocol was fastest (10.7 seconds vs. 23.6 seconds and 50.3 seconds), the 49-scan protocol provided the best balance between speed and precision.
    Conclusions: Most AI-derived OCT biomarkers in DME are robust to reduced scan density, but IRF volume shows increasing error with undersampling. Higher-density scans should be reserved for cases in which precise fluid quantification is required.
    Translational Relevance: Scan density materially influences AI-derived IRF quantification. Identifying practical acquisition thresholds enables protocol standardization while reducing imaging burden in clinical practice and trials.
    DOI:  https://doi.org/10.1167/tvst.15.5.12
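The fixed bias reported above comes from a Bland-Altman comparison of paired measurements. A minimal sketch of the mean bias and 95% limits of agreement is given below. The sign convention (dense minus sparse) is an assumption, since the abstract does not state which protocol was subtracted from which, and the volumes are toy values rather than study data.

```python
from statistics import mean, stdev


def bland_altman(a, b):
    """Return (mean bias, lower limit, upper limit) for paired values a - b."""
    if len(a) != len(b):
        raise ValueError("paired measurements must have equal length")
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)  # sample SD of the paired differences
    return bias, bias - spread, bias + spread


# Toy IRF volumes in mm^3 for the same eyes under two scan protocols.
dense_protocol = [0.10, 0.50, 1.20, 2.00]
sparse_protocol = [0.12, 0.55, 1.30, 2.15]

bias, lower, upper = bland_altman(dense_protocol, sparse_protocol)
```

Under this convention a negative bias means the sparse protocol reads higher than the dense one. In the toy data the gap also widens as volume grows, which is the pattern a proportional-bias check is meant to detect.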
  24. Int J Retina Vitreous. 2026 May 16.
       PURPOSE: To propose inter-disease out-of-domain generalization (OODG) across retinal diseases for microaneurysm (MA) segmentation using a deep-learning model trained and validated on diabetic retinopathy (DR) and qualitatively evaluated on leukemic retinopathy (LR).
    METHODS: A U-Net based segmentation model was trained using the IDRiD dataset, which comprises 81 DR images, using only MA annotations. The images were split into patches, and a statistical filtering step was applied to retain only structurally homogeneous patches. The study was organized in two phases: in Phase I, the U-Net was trained and evaluated using DR patches; in Phase II, the model was tested directly, without retraining, on LR images. Finally, MA segmentations were subjected to qualitative assessment by clinical specialists.
    RESULTS: The proposed U-Net achieved an IoU of 0.842, a Dice score of 0.914, an accuracy of 0.998, and a validation loss of 0.120.
    CONCLUSION: The results suggest that knowledge learned from DR can generalize effectively to a related clinical context such as LR, opening the possibility of reusing models in diseases with similar structures and lesion patterns.
    Keywords:  Diabetic retinopathy; Leukemic retinopathy; Microaneurysm; Out of domain generalization; Semantic segmentation
    DOI:  https://doi.org/10.1186/s40942-026-00844-z
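The IoU and Dice values reported above are linked by a fixed identity for a single prediction/ground-truth pair: Dice = 2·IoU / (1 + IoU). The sketch below checks that the reported figures are mutually consistent and computes both overlaps on a toy mask. Function names and pixel sets are illustrative only; when scores are averaged over many images the identity holds only approximately.

```python
def dice_from_iou(iou):
    """Convert IoU to Dice for one prediction/ground-truth pair."""
    if not 0.0 <= iou <= 1.0:
        raise ValueError("IoU must lie in [0, 1]")
    return 2.0 * iou / (1.0 + iou)


def iou_and_dice(pred, truth):
    """Compute IoU and Dice from two sets of foreground pixel coordinates."""
    intersection = len(pred & truth)
    union = len(pred | truth)
    total = len(pred) + len(truth)
    iou = intersection / union if union else 1.0
    dice = 2.0 * intersection / total if total else 1.0
    return iou, dice


# Reported IoU of 0.842 implies a Dice score of about 0.914.
implied_dice = dice_from_iou(0.842)

# Toy masks as sets of (row, col) pixels.
pred = {(0, 0), (0, 1), (1, 1)}
truth = {(0, 1), (1, 1), (2, 2)}
toy_iou, toy_dice = iou_and_dice(pred, truth)
```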
  25. Curr Comput Aided Drug Des. 2026 May 10.
       INTRODUCTION: Diabetes mellitus is a chronic metabolic disease characterized by long-term hyperglycaemia. Prolonged illness may lead to serious complications in vital organs, including the kidneys, nerves, and cardiovascular system. Therefore, early prevention of diabetes is of paramount importance.
    METHODS: In this study, we propose a novel ensemble learning method (FT-TRF) that combines FT-Transformer and Random Forest (RF) to efficiently identify potential antidiabetic compounds. The method pairs the global feature interaction capabilities of FT-Transformer with the local feature dependency modeling of RF. It incorporates feature engineering and selection to identify important features, enhancing the data representation capability of the FT-Transformer model. The predictive probabilities from both models are integrated through a dynamic linear weighting mechanism, enhancing the model's generalization ability.
    RESULTS: We validated the proposed model through experiments using the most recent diabetes-related compounds dataset. The results demonstrate that our hybrid model outperforms traditional classifiers across multiple metrics, including the Area Under the Curve (AUC), sensitivity, specificity, Kappa coefficient, Matthews correlation coefficient (MCC), F1 Score, Precision-Recall (PR) curve, and Receiver Operating Characteristic (ROC) curve. In addition, our model correctly screened compounds related to diabetes on other datasets.
    DISCUSSION: Our method outperforms others in identifying diabetes-related compounds, showing a robust F1 Score and AUC. To enhance the statistical rigor of these findings, future studies will apply multiple-comparison corrections, such as Bonferroni or false discovery rate control.
    CONCLUSION: The findings are summarized primarily through AUC and F1 Score, which serve as comprehensive comparison measures. These results confirm the superior performance of our integrated method for identifying diabetes-related compounds.
    Keywords:  Diabetes; FT-transformer; deep learning; drug identification; network pharmacology; random forest
    DOI:  https://doi.org/10.2174/0115734099423682251202211016
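The abstract above states that the predicted probabilities of the two models are combined by linear weighting but does not say how the weight is set. The sketch below assumes one plausible reading: a single weight w in [0, 1], picked by a grid search on validation accuracy. The functions `blend`, `accuracy`, and `pick_weight` and all numbers are hypothetical and should not be taken as the authors' mechanism.

```python
def blend(p_ft, p_rf, w):
    """Linear blend w * p_ft + (1 - w) * p_rf of two probability lists."""
    if not 0.0 <= w <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return [w * a + (1.0 - w) * b for a, b in zip(p_ft, p_rf)]


def accuracy(probs, labels, threshold=0.5):
    """Fraction of samples whose thresholded probability matches the label."""
    hits = sum((p >= threshold) == bool(y) for p, y in zip(probs, labels))
    return hits / len(labels)


def pick_weight(p_ft, p_rf, labels, steps=10):
    """Grid-search the blend weight that maximises validation accuracy."""
    candidates = [k / steps for k in range(steps + 1)]
    return max(candidates, key=lambda w: accuracy(blend(p_ft, p_rf, w), labels))


# Hypothetical validation-set probabilities from the two base models.
p_ft = [0.9, 0.1, 0.8, 0.3]
p_rf = [0.4, 0.6, 0.7, 0.2]
labels = [1, 0, 1, 0]

best_w = pick_weight(p_ft, p_rf, labels)
blended = blend(p_ft, p_rf, best_w)
```

A per-sample or confidence-dependent weight would be another way to read "dynamic"; the single global weight here is only the simplest case.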
  26. Nutrition. 2026 Apr 18. pii: S0899-9007(26)00143-7. [Epub ahead of print] 113235
      
    Keywords:  Artificial intelligence; Carbohydrate counting; Dietitian; Turkish food; Type 1 diabetes
    DOI:  https://doi.org/10.1016/j.nut.2026.113235
  27. Int J Mol Med. 2026 Jul;pii: 198. [Epub ahead of print] 58(1):
      Diabetic kidney disease (DKD) represents a major complication associated with diabetes mellitus, notably contributing to patient morbidity and mortality. However, early diagnosis of DKD remains challenging due to the lack of clear diagnostic biomarkers. Therefore, in the present study, microarray and RNA‑sequencing data from the Gene Expression Omnibus database were systematically analyzed. Using differential expression and weighted gene co‑expression network analysis, 49 genes with marked expression changes in DKD were identified. Subsequent analyses, including functional enrichment, protein‑protein interaction network construction, machine learning techniques and assessment of immune cell infiltration, led to the identification of three hub genes: Spleen‑associated tyrosine kinase, apoptotic peptidase activating factor 1 and ADAM metallopeptidase domain 10, as promising diagnostic markers, which were further evaluated by receiver operating characteristic curve analysis. Expression changes of the identified hub genes were validated in both DKD mouse models and clinical patient samples. Collectively, the present study provided a novel perspective on the molecular basis of DKD, and highlighted novel candidates for potential diagnostic and therapeutic applications.
    Keywords:  biomarkers; diabetic kidney disease; immune cell infiltration; machine learning; weighted gene co‑expression network analysis
    DOI:  https://doi.org/10.3892/ijmm.2026.5869
  28. BMJ Open. 2026 May 20. 16(5): e115755
       OBJECTIVES: To elicit stated preferences and willingness-to-pay (WTP) for artificial intelligence (AI)-enabled blended care in type 2 diabetes mellitus (T2DM), and to examine preference heterogeneity by digital experience and socioeconomic status (SES).
    DESIGN: Cross-sectional discrete choice experiment (DCE).
    SETTING: 12 community health centres in Jiaozuo and Puyang, Henan Province, China. Data were collected between June and August 2025.
    PARTICIPANTS: 423 adults diagnosed with T2DM for at least 6 months, recruited using consecutive convenience sampling from routine follow-up appointments. Of 769 participants who completed the survey, 346 were excluded following prespecified data quality criteria (retention rate: 55.0%).
    OUTCOME MEASURES: Outcome measures included preference weights and WTP (in Chinese Yuan, ¥) for five DCE attributes: monthly subscription fee, recommendation source, feedback modality, in-person follow-up frequency and expert oversight, estimated using mixed logit models. Simulated uptake probabilities for tailored service packages across four user profiles were computed.
    RESULTS: Among 423 participants, 80.6% had never used AI tools. Price was the dominant driver of choice (62.7% relative attribute importance). Profound preference heterogeneity emerged across subgroups: rural residents (n=78) were highly price-sensitive but preferred physician endorsement (WTP ¥17.58 (US$2.55), 95% CI ¥5.98 to ¥29.17); female participants (n=224) valued guideline recommendations (WTP ¥18.45 (US$2.67), 95% CI ¥7.81 to ¥29.40); and diabetes app users (n=34) were the least price-sensitive but showed a negative preference for AI instant feedback, instead preferring human dietitian feedback. Expert oversight carried a consistent negative WTP across all profiles. Targeting tailored service bundles to intended subgroups increased uptake by 8-16 percentage points compared with non-targeted bundles.
    CONCLUSIONS: A 'digital experience paradox' exists whereby digitally experienced users view human interaction as a premium service, while underserved groups rely on specific trust markers such as physician endorsement. To avoid widening the digital divide, AI-enabled blended diabetes care must move beyond standardised models towards configurable, equity-driven service pathways.
    Keywords:  Artificial Intelligence; Diabetes Mellitus, Type 2; Patient Preference
    DOI:  https://doi.org/10.1136/bmjopen-2025-115755
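Willingness-to-pay figures such as those above are conventionally derived from choice-model coefficients as the negative ratio of an attribute coefficient to the price coefficient. The sketch below shows that ratio under the assumption of a preference-space model with a negative price coefficient. The coefficients are invented for illustration and are not the study's estimates, and a mixed logit analysis would also need to account for the distribution of the random coefficients.

```python
def willingness_to_pay(beta_attribute, beta_price):
    """WTP in currency units: -beta_attribute / beta_price."""
    if beta_price >= 0:
        raise ValueError("price coefficient is expected to be negative")
    return -beta_attribute / beta_price


# Invented coefficients (utility per level, and utility per yuan of monthly fee).
beta_price = -0.05
wtp_valued_attribute = willingness_to_pay(0.50, beta_price)
wtp_disliked_attribute = willingness_to_pay(-0.25, beta_price)
```

A negative result, as in the second call, means respondents would need to be compensated to accept the attribute, which is how a negative WTP such as the one reported for expert oversight is read.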