bims-aukdir Biomed News
on Automated knowledge discovery in diabetes research
Issue of 2025-11-09
fourteen papers selected by
Mott Given



  1. Sci Rep. 2025 Nov 07. 15(1): 39047
      Diabetic Retinopathy (DR) is a progressive complication of diabetes and a leading cause of preventable blindness worldwide. Early detection and accurate classification of DR severity are critical for timely intervention but remain challenging, particularly in resource-constrained settings. While conventional deep learning (DL) models based on Convolutional Neural Networks (CNNs) have shown promising results, they often struggle to capture long-range dependencies in retinal fundus images and typically require substantial computational resources, limiting their utility on low-cost hardware. To address these challenges, this study introduces a Task-Optimized Vision Transformer (TOViT) model, specifically designed for DR detection and severity classification. The model integrates several optimization strategies, including layer-wise learning rate scheduling, attention head tuning, and embedding dimension refinement, to enhance feature extraction while maintaining computational efficiency. The model is further compressed through structured pruning and 8-bit quantization to support real-time deployment on Raspberry Pi 4 hardware. Evaluated on three large-scale public datasets, TOViT achieved a classification accuracy of 99%, with F1-scores exceeding 93% across all DR stages. Hardware implementation yielded real-time performance, processing at 8 frames per second with 120 ms latency, confirming its potential for use in portable, point-of-care screening devices. This work presents a scalable and clinically relevant approach for automated DR diagnosis, with promising implications for expanding access to early retinal screening in global healthcare systems.
    Keywords:  Deep learning; Diabetic retinopathy; Lightweight model; Real-time screening; Resource-constrained healthcare; Vision transformer
    DOI:  https://doi.org/10.1038/s41598-025-25399-1
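The 8-bit quantization step described in this abstract can be illustrated with a minimal numpy sketch of symmetric per-tensor post-training quantization. The scheme below (one float scale per weight tensor, int8 codes) is an assumption for illustration; the paper's actual compression pipeline is not reproduced here.

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor quantization: map float32 weights onto int8 levels."""
    scale = np.abs(w).max() / 127.0          # one scale factor for the whole tensor
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the int8 codes."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.1, size=(64, 64)).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
max_err = float(np.abs(w - w_hat).max())     # bounded by half a quantization step
```

Storing int8 codes plus a single float scale cuts weight memory roughly fourfold versus float32, the kind of saving that makes deployment on hardware like a Raspberry Pi 4 feasible.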
  2. Arch Endocrinol Metab. 2025 Nov 06. 69(6): e250168
       OBJECTIVE: To evaluate and to compare machine learning models for predicting hypertension in patients with diabetes using routine clinical variables.
    METHODS: Using Behavioral Risk Factor Surveillance System data, models were trained on 35,346 individuals with seven variables ("HighChol", "BMI", "Smoker", "PhysActivity", "Sex", and "Age") to predict the occurrence of hypertension in patients with diabetes ("HTNinDM"). Models included neural network, gradient boosting, random forest, Adaptive Boosting, and logistic regression. Performance was assessed by area under the curve, accuracy, precision, recall, and F1 score using cross-validation. Class imbalance was addressed via diverse models. Feature importance was evaluated by permutation importance of a random forest model.
    RESULTS: The neural network model achieved the best performance with area under the curve 0.689, accuracy 76.5%, precision 76.3%, recall 98.8%. Gradient boosting models performed similarly. Age and body mass index were the top predictors.
    CONCLUSION: Machine learning models show potential for identifying patients with diabetes at high hypertension risk using routine clinical data. The neural network model achieved the strongest predictive performance of the approaches tested.
    Keywords:  Diabetes mellitus; Hypertension; Machine learning
    DOI:  https://doi.org/10.20945/2359-4292-2025-0168
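The evaluation protocol described above, several classifiers compared by cross-validated AUC, can be sketched with scikit-learn on synthetic data. The data below is a stand-in (not BRFSS), and only two of the five model families are shown:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for seven routine clinical variables (not BRFSS data)
X, y = make_classification(n_samples=600, n_features=7, n_informative=4,
                           weights=[0.6, 0.4], random_state=42)

models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=42),
}
# 5-fold cross-validated AUC for each candidate model
aucs = {name: cross_val_score(m, X, y, cv=5, scoring="roc_auc").mean()
        for name, m in models.items()}
```

Averaging AUC over folds, rather than scoring a single split, is what makes the comparison between model families in the abstract meaningful.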
  3. Med Biol Eng Comput. 2025 Nov 07.
      Diabetic retinopathy (DR), a common diabetic complication, stands as a primary cause of retinal blindness. Rapid automatic DR grading is a crucial approach for prevention. Although deep learning has shown promising results in automated DR grading, its deployment in clinical settings remains challenging due to variations in imaging devices, lighting, and other acquisition conditions across hospitals. This study explores the problem of decreased generalization performance in DR grading, stemming from variations in the distributions of the source and target domains, namely addressing the domain generalization (DG) problem in DR grading. To tackle this challenge, we propose a novel Fourier-based domain generalization framework for fundus images, which consists of three key innovation elements: (1) Fourier spectrum enhancement: a Fourier-based image enhancement technique preserves critical high-frequency features by leveraging phase information, significantly improving cross-domain robustness; (2) Collaborative teacher-student knowledge distillation: a dual-network learning mechanism enhances generalization by distilling high-level semantic features from a teacher model to a student model; (3) Feature fusion: a feature fusion module effectively differentiates intra-class and inter-class features, thereby further enhancing classification performance. Extensive evaluation of our framework on six clinically realistic DR datasets demonstrates superior generalization performance compared to existing methods. Furthermore, this study reveals the critical role of Fourier phase information and high-level semantic features in improving generalization, bridging an important research gap in DR grading.
    Keywords:  Data augmentation; Diabetic retinopathy; Domain generalization; Image classification
    DOI:  https://doi.org/10.1007/s11517-025-03469-w
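The role of Fourier phase emphasized above rests on a standard observation: phase carries an image's structure, while the amplitude spectrum carries much of its "style". A common Fourier-based augmentation in this spirit, sketched below as an assumption (not the paper's exact enhancement), blends the amplitude spectrum toward a reference image from another domain while keeping the source phase:

```python
import numpy as np

def fourier_amplitude_mix(img_src, img_ref, alpha=0.5):
    """Blend the amplitude spectrum toward a reference image while keeping
    the source image's phase, which carries the structural content."""
    f_src = np.fft.fft2(img_src)
    f_ref = np.fft.fft2(img_ref)
    amp = (1 - alpha) * np.abs(f_src) + alpha * np.abs(f_ref)
    phase = np.angle(f_src)
    mixed = amp * np.exp(1j * phase)
    return np.real(np.fft.ifft2(mixed))
```

With alpha=0 the source image is recovered unchanged; increasing alpha injects more of the reference domain's amplitude statistics, simulating cross-hospital acquisition shifts during training.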
  4. Clin Exp Ophthalmol. 2025 Nov 03.
      Artificial intelligence (AI) has comparable accuracy to ophthalmologists for diabetic retinopathy (DR) screening, yet its cost-effectiveness is crucial for implementation. Our review of 18 health economic analyses of AI versus manual grading for DR found significant methodological variation, with cost-utility analysis and Markov modelling being the commonest evaluation and modelling approaches, respectively. We identified three key considerations when appraising health economic analyses of AI-enabled DR screening: the importance of contextualised parameters including subgroup analysis, real-world data on adherence to ophthalmology follow-up, and the trade-off between diagnostic accuracy and cost-effectiveness. Only 39% of studies followed standardised reporting guidelines, and most did not consider improved follow-up after AI screening, potentially underestimating its economic value. Future evaluations should incorporate contextualised parameters, including adherence and regional data, and recognise that the most accurate diagnostic screening may not reflect the most cost-effective. Studies should follow updated reporting guidelines such as CHEERS-AI or PICOTS-ComTeC to improve methodological transparency.
    Keywords:  cost-effectiveness; deep learning; implementation science; machine learning; retinal diseases
    DOI:  https://doi.org/10.1111/ceo.70016
  5. Sci Rep. 2025 Nov 03. 15(1): 38376
      Continuous glucose monitoring (CGM) devices allow real-time glucose readings leading to improved glycemic control. However, glucose predictions in the lower (hypoglycemia) and higher (hyperglycemia) extremes, referred to as glycemic excursions, remain challenging due to their rarity. Moreover, limited access to sensitive patient data hampers the development of robust machine learning models even with advanced deep learning algorithms available. We propose to simultaneously provide accurate glucose predictions in the excursion regions while addressing data privacy concerns. To tackle excursion prediction, we propose a novel Hypo-Hyper (HH) loss function that penalizes errors based on the underlying glycemic range, with a higher penalty at the extremes than over the normal glucose range. To address privacy concerns, we propose FedGlu, a machine learning model trained in a federated learning (FL) framework. FL allows collaborative learning without sharing sensitive data by training models locally and sharing only model parameters across other patients. The HH loss combined with FedGlu addresses both challenges simultaneously. The HH loss function demonstrates a 46% improvement over mean-squared error (MSE) loss across 125 patients. FedGlu improved glycemic excursion detection by 35% compared to local models. This improvement translates to enhanced performance in predicting both hypoglycemia and hyperglycemia for 105 out of 125 patients. These results underscore the effectiveness of the proposed HH loss function in augmenting the predictive capabilities of glucose predictions. Moreover, implementing models within a federated learning framework not only ensures better predictive capabilities but also safeguards sensitive data concurrently.
    Keywords:  Continuous glucose monitoring; Diabetes; Federated learning; Glucose prediction; Hyperglycemia; Hypoglycemia
    DOI:  https://doi.org/10.1038/s41598-025-22316-4
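The core idea of the HH loss, penalizing errors more heavily in the hypoglycemic and hyperglycemic ranges, can be sketched as a range-weighted squared error. The thresholds and weights below are illustrative assumptions, not the paper's published formulation:

```python
import numpy as np

def hh_style_loss(y_true, y_pred, hypo=70.0, hyper=180.0,
                  w_hypo=3.0, w_hyper=2.0):
    """Range-weighted squared error: errors in the hypo (<70 mg/dL) and hyper
    (>180 mg/dL) ranges are penalized more than errors in the normal range.
    Thresholds and weights are illustrative, not the published HH loss."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    w = np.ones_like(y_true)
    w[y_true < hypo] = w_hypo
    w[y_true > hyper] = w_hyper
    return float(np.mean(w * (y_true - y_pred) ** 2))
```

Because excursions are rare, an unweighted MSE is dominated by the normal range; up-weighting the extremes pushes the model to fit exactly the regions that matter clinically.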
  6. Int J Med Inform. 2025 Oct 18. pii: S1386-5056(25)00378-8. [Epub ahead of print] 206: 106161
       OBJECTIVE: Accurate prediction of type 2 diabetes mellitus (T2DM) onset is critical to enable timely interventions and preventive strategies. Although machine learning (ML) approaches have shown promise in risk prediction, their complexity often limits clinical implementation. There is a need for interpretable, user-friendly models that retain predictive strength.
    METHODS: We studied 904 cardiovascular risk patients without T2DM at baseline, assessing 71 anthropometric, clinical, and laboratory variables. Over a four-year follow-up, 10% developed T2DM. We applied AutoScore, an interpretable ML framework that generates parsimonious, point-based risk scores, and compared its performance with an optimized Support Vector Machine (SVM) with a linear kernel. The SVM was refined using feature selection, Tomek link removal, and up-sampling to address class imbalance.
    RESULTS: Both approaches consistently identified fasting glucose, OGTT glucose, and the Matsuda index (reflecting glucose-insulin dynamics) as key predictors. The optimized SVM model achieved a higher balanced accuracy (75% vs. 67%), specificity (80% vs. 77%), and AUC (0.72 vs. 0.69) compared to AutoScore. However, unlike the SVM model, AutoScore relied exclusively on a small set of routinely available parameters and thereby offered superior interpretability and ease of integration into clinical workflows. External validation in an independent cohort further confirmed the robustness of the AutoScore model.
    CONCLUSION: Although black-box models such as SVM deliver slightly higher predictive accuracy, interpretable frameworks like AutoScore provide clinically actionable risk stratification based on standard data. Their transparency and simplicity make them particularly valuable for real-world decision support.
    Keywords:  Artificial intelligence; Biomarker; Cardiovascular risk; Diabetes incidence; Machine learning; Risk prediction
    DOI:  https://doi.org/10.1016/j.ijmedinf.2025.106161
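A point-based score of the AutoScore kind reduces each predictor to integer points summed into a total. The cutoffs and point values below are made up for illustration and are not the published score:

```python
def risk_points(fasting_glucose, ogtt_glucose, matsuda_index):
    """Toy point-based T2DM risk score in the spirit of AutoScore.
    Cutoffs and points are illustrative assumptions, not the published model."""
    points = 0
    points += 2 if fasting_glucose >= 100 else 0   # mg/dL, impaired fasting glucose
    points += 2 if ogtt_glucose >= 140 else 0      # mg/dL, impaired glucose tolerance
    points += 1 if matsuda_index < 4.0 else 0      # lower Matsuda = more insulin resistance
    return points
```

Because the score is a sum of small integers over a handful of routine measurements, it can be computed by hand at the bedside, which is exactly the interpretability advantage the abstract contrasts against the black-box SVM.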
  7. Front Endocrinol (Lausanne). 2025. 16: 1689312
       Background: Type 2 diabetes mellitus (T2DM) is a highly prevalent non-communicable chronic disease that substantially reduces life expectancy. Accurate estimation of all-cause mortality risk in T2DM patients is crucial for personalizing and optimizing treatment strategies.
    Methods: This study analyzed a cohort of 554 patients (aged 40-87 years) with diagnosed T2DM over a maximum follow-up period of 16.8 years, during which 202 patients (36%) died. Key survival-associated features were identified, and multiple machine learning (ML) models were trained and validated to predict all-cause mortality risk. To improve model interpretability, Shapley additive explanations (SHAP) was applied to the best-performing model.
    Results: The extra survival trees (EST) model, incorporating ten key features, demonstrated the best predictive performance. The model achieved a C-statistic of 0.776, with area under the receiver operating characteristic curve (AUC) values of 0.86, 0.80, 0.841, and 0.826 for 5-, 10-, 15-, and 16.8-year all-cause mortality predictions, respectively. The SHAP approach was employed to interpret the model's individual decision-making processes.
    Conclusion: The developed model exhibited strong predictive performance for mortality risk assessment. Its clinically interpretable outputs enable potential bedside application, improving the identification of high-risk patients and supporting timely treatment optimization.
    Keywords:  all-cause mortality risk; explainable artificial intelligence; machine learning; predictive model; type 2 diabetes
    DOI:  https://doi.org/10.3389/fendo.2025.1689312
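The C-statistic reported above measures how often the model ranks the patient with the earlier event as higher risk. A minimal O(n²) sketch of Harrell's C for right-censored survival data (pairs are comparable only when the earlier time is an observed event):

```python
def concordance_index(time, event, risk):
    """Harrell's C: fraction of comparable pairs in which the subject with the
    earlier observed event also carries the higher predicted risk.
    `event[i]` is 1 for an observed death, 0 for censoring."""
    concordant, comparable = 0.0, 0
    n = len(time)
    for i in range(n):
        if not event[i]:
            continue                      # censored subjects cannot anchor a pair
        for j in range(n):
            if time[i] < time[j]:         # i's event precedes j's follow-up time
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1
                elif risk[i] == risk[j]:
                    concordant += 0.5     # ties in predicted risk count half
    return concordant / comparable
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so the reported 0.776 indicates that roughly three of four comparable patient pairs are ordered correctly.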
  8. Sci Rep. 2025 Nov 05. 15(1): 38720
      Diabetes mellitus is a major global health burden, and early identification of insulin dependency is important for timely intervention. This study developed an artificial intelligence-based diagnostic system using a real-world clinical dataset of 100 anonymized patient records, collected with ethical approval and informed consent. The dataset included demographic, lifestyle, and biochemical variables such as glycated hemoglobin (HbA1c), fasting blood sugar (FBS), and postprandial blood sugar (PPBS). After preprocessing to handle missing values, normalize continuous variables, and encode categorical features, four machine learning models were implemented: Logistic Regression, Random Forest, XGBoost, and LightGBM, along with ensemble-based combined approaches. Model evaluation was performed using 5-fold cross-validation with accuracy, precision, recall, and F1-score as metrics. XGBoost achieved the highest performance (accuracy 0.88, precision 0.86, recall 0.90, F1-score 0.88), followed by LightGBM (accuracy 0.85, F1-score 0.84), Random Forest (accuracy 0.82, F1-score 0.81), and Logistic Regression (accuracy 0.76, F1-score 0.74). The most predictive features were PPBS and HbA1c, consistent with clinical understanding. While results are promising, they reflect a single-center dataset of 100 records and should be interpreted as preliminary; further work will include external validation on larger, multi-site cohorts prior to clinical adoption.
    Keywords:  Diabetes; Ensemble learning; Insulin dependency; LightGBM; Machine learning; Predictive modeling
    DOI:  https://doi.org/10.1038/s41598-025-22381-9
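As a quick consistency check on the reported metrics: F1 is the harmonic mean of precision and recall, and XGBoost's reported precision 0.86 and recall 0.90 do reproduce the reported F1 of 0.88.

```python
def f1_from_precision_recall(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

xgboost_f1 = f1_from_precision_recall(0.86, 0.90)   # reported as 0.88
```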
  9. Sci Rep. 2025 Nov 05. 15(1): 38702
      We aimed to construct and validate interpretable models for predicting mortality risk using machine learning (ML) methods to identify the risk factors associated with mortality in patients with diabetic neuropathy (DN). We selected patients from the US-based critical care database (Medical Information Mart for Intensive Care (MIMIC-IV)). Independent risk factors for in-hospital death were screened using Least Absolute Shrinkage and Selection Operator (LASSO). Subsequently, we constructed mortality risk prediction models utilizing random forest (RF), extreme gradient boosting (XGBoost), support vector machine (SVM), and logistic regression (LR). Finally, we comprehensively assessed and interpreted the best-performing ML model using SHapley Additive exPlanations (SHAP) analysis. The final study enrolled 1,313 patients with DN, randomly split into training (1,050, 80%) and validation sets (263, 20%). The RF model demonstrated superior performance, with an area under the curve (AUC) of 0.780 for the validation set. According to the feature importance result, red blood cell distribution width (RDW)_mean was identified as the most influential feature. This study demonstrated the potential of leveraging ML as a viable approach for predicting mortality risk in patients with DN. An interpretable ML model exhibited strong performance that could support clinical decision-making and enhance patient prognosis to some extent.
    Keywords:  Diabetic neuropathies; MIMIC-IV database; Machine learning algorithms; Predictive learning model
    DOI:  https://doi.org/10.1038/s41598-025-22363-x
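The LASSO screening step described above keeps only features whose penalized coefficients remain non-zero. A scikit-learn sketch on synthetic data (a stand-in for the MIMIC-IV variables; the L1-penalized logistic regression and the regularization strength `C=0.1` are illustrative choices):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in: 5 informative features among 30 candidates (not MIMIC-IV data)
X, y = make_classification(n_samples=500, n_features=30, n_informative=5,
                           random_state=1)

# L1-penalized logistic regression: coefficients driven exactly to zero are screened out
screen = LogisticRegression(penalty="l1", solver="liblinear", C=0.1).fit(X, y)
selected = np.flatnonzero(screen.coef_[0])           # indices of retained features
```

The retained subset would then feed the downstream RF, XGBoost, SVM, and LR models, as in the study's pipeline.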
  10. J Biomed Inform. 2025 Oct 31. pii: S1532-0464(25)00174-1. [Epub ahead of print] 172: 104945
     INTRODUCTION: Management of type 1 diabetes remains a significant challenge as blood glucose levels can fluctuate dramatically and are highly individual. We introduce an innovative approach that combines multimodal large language models (mLLMs), mechanistic modeling of individual glucose metabolism, and machine learning (ML) for forecasting blood glucose levels.
    METHODS: This study uses the D1NAMO dataset (6 patients with meal images) to demonstrate mLLM integration for glucose prediction. An mLLM (Pixtral Large) was employed to estimate macronutrients from meal images, providing automated meal analysis without manual food logging. We compare three distinct approaches: (1) Baseline using only glucose dynamics and basic insulin features, (2) LastMeal providing additional information about the last meal ingested by the patient, and (3) Bézier incorporating mechanistically modeled temporal features using optimized cubic Bézier curves to model temporal impacts of individual macronutrients on blood glucose. The modeled feature impacts served as input features for a LightGBM model. We also validate the mechanistic modeling component on the AZT1D dataset (24 patients with structured carbohydrate and correction insulin logs).
    RESULTS: The Bézier approach achieved the best performance across both datasets: D1NAMO RMSE of 15.06 at 30 min and 28.15 at 60 min; AZT1D RMSE of 16.61 at 30 min and 24.58 at 60 min. One-way ANOVA revealed statistically significant differences across prediction horizons of 45 to 120 min for the AZT1D dataset. Patient-specific Bézier curves revealed distinct metabolic response patterns: simple sugars peaked at 0.74 h, complex sugars at 3.07 h, and proteins at 4.36 h post-ingestion. Feature importance analysis showed temporal evolution from glucose change dominance to macronutrient prominence at longer horizons. Patient-specific modeling uncovered individual metabolic signatures with varying nutritional sensitivity and circadian influences.
    CONCLUSION: This study demonstrates the potential of combining mLLMs with mechanistic modeling for personalized diabetes management. The optimized Bézier curve approach provides superior temporal mapping while patient-specific models reveal individual metabolic signatures essential for personalized care.
    Keywords:  Health monitoring; Large language models; Machine learning; Mechanistic modeling; Multimodality; Type 1 diabetes
    DOI:  https://doi.org/10.1016/j.jbi.2025.104945
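A cubic Bézier curve gives a smooth, four-parameter shape for a macronutrient's glucose impact over time, which is what the abstract's mechanistic features optimize per patient. The control points below are illustrative, not the paper's fitted values:

```python
import numpy as np

def cubic_bezier(p0, p1, p2, p3, t):
    """Evaluate a cubic Bézier curve at parameters t in [0, 1]."""
    t = np.asarray(t, dtype=float)
    return ((1 - t) ** 3 * p0
            + 3 * (1 - t) ** 2 * t * p1
            + 3 * (1 - t) * t ** 2 * p2
            + t ** 3 * p3)

# Illustrative impact profile: starts at zero, rises, then decays back to zero
t = np.linspace(0.0, 1.0, 101)
impact = cubic_bezier(0.0, 1.2, 0.4, 0.0, t)
peak_time = float(t[np.argmax(impact)])      # when the modeled impact peaks
```

Fitting the two interior control points per macronutrient and per patient is how curves like these could encode, for example, simple sugars peaking earlier than proteins; the sampled curve values then serve as input features for a gradient-boosting model such as LightGBM.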
  11. Anal Chem. 2025 Nov 07.
      Diabetes mellitus (DM), a prevalent metabolic disorder, poses significant diagnostic and therapeutic challenges, especially in the early-stage diagnosis of diabetes-related complications. Accurate early-stage diagnosis of diabetes and its complications is essential for preventing chronic health problems, improving treatment outcomes, and reducing healthcare costs by enabling timely medical interventions and personalized management strategies. However, conventional diagnostic techniques often lack the sensitivity and accuracy essential for early-stage detection and classification of diabetes and its complications. In this context, we developed an advanced diagnostic model that utilizes gold nanoparticles (AuNPs) functionalized with 4-mercaptophenylboronic acid (AuNPs@4MPBA) to enable both specific and nonspecific SERS detection of diabetes biomarkers in serum samples. An artificial intelligence (AI)-assisted self-calibration method was integrated within the SERS-based diagnostic system to enable efficient early-stage detection of diabetes and its associated complications. This combined approach demonstrated high diagnostic accuracy while requiring only minimal sample volumes, unlike conventional diagnostic methods. A ResNet-LSTM multihead self-attention neural network, integrated with the self-calibrating SERS technique, enabled precise classification and detection of diabetes and related complications. Unlike conventional diagnostic methods, which have limited scope for tracking post-medication complications, the present self-calibrating SERS-AI diagnostic method provided accurate, reliable diagnosis of diabetic patients even with a premedication history. Furthermore, the incorporation of cosine similarity and Pearson's correlation methods ascertained the generalization and improved accuracy of the diagnostic model, while limiting the scope for clinical misdiagnosis.
    DOI:  https://doi.org/10.1021/acs.analchem.5c04281
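The cosine-similarity and Pearson-correlation checks mentioned above compare a measured spectrum against a reference; both are one-liners over numpy arrays. The synthetic "spectrum" below is a stand-in, not SERS data:

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two spectra; 1.0 for identical shapes."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def pearson_correlation(a, b):
    """Linear correlation between two spectra; insensitive to baseline offset."""
    return float(np.corrcoef(a, b)[0, 1])

x = np.linspace(0.0, 6.0, 200)
reference = np.sin(x) ** 2                                   # stand-in reference spectrum
measured = reference + np.random.default_rng(0).normal(0, 0.05, 200)
```

Cosine similarity is scale-invariant while Pearson correlation also removes any constant baseline, so using the two together guards against both intensity and offset artifacts when deciding whether a spectrum matches a reference class.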
  12. Diabetol Metab Syndr. 2025 Nov 06. 17(1): 419
       BACKGROUND: The AI-CVD initiative aims to maximize the value of coronary artery calcium (CAC) scans for cardiometabolic risk prediction by extracting opportunistic screening information. We investigated whether artificial intelligence (AI)-derived measures from CAC scans are associated with new-onset Type 2 diabetes mellitus (T2DM) in adults without obesity or hyperglycemia.
    METHODS: Baseline CAC scans and up to 23 years of follow-up data were analyzed for participants without obesity (body mass index < 30 kg/m²) and hyperglycemia (fasting plasma glucose < 100 mg/dL) from the Multi-Ethnic Study of Atherosclerosis (MESA). AI-derived measures included liver attenuation index (LAI), subcutaneous fat index (SFI), total visceral fat index (TVFI), epicardial fat index (EFI), skeletal muscle index, and skeletal muscle mean density. Cox regression models compared highest vs. lowest quartiles of each AI-derived metric for T2DM risk. Multivariable models assessed adjusted predictive value using Wald chi-squared statistics. Subgroup analyses stratified participants by demographic and clinical factors.
    RESULTS: During a median follow-up of 19.7 years among 2,993 participants (baseline mean age 61.9 ± 10.5 years, 53% women), 257 participants (8.6%) developed T2DM. Key predictors included LAI (HR: 3.13, 95% CI: 2.15-4.55), SFI (HR: 2.85, 95% CI: 1.93-4.21), TVFI (HR: 2.49, 95% CI: 1.72-3.60), and EFI (HR: 1.59, 95% CI: 1.09-2.32). LAI remained the most robust predictor after adjusting for all metrics (Wald χ² = 38.24). Subgroup analyses confirmed LAI's consistent predictive performance.
    CONCLUSION: AI-derived adiposity measures from CAC scans, especially liver fat, can identify adults without obesity or hyperglycemia at elevated risk for developing T2DM. These findings underscore the potential of AI-enabled opportunistic screening during CAC imaging to support early T2DM risk stratification in individuals not captured by current clinical guidelines.
    Keywords:  AI-CVD; Artificial intelligence; Coronary artery calcium (CAC) scans; Opportunistic screening; Type 2 diabetes mellitus
    DOI:  https://doi.org/10.1186/s13098-025-01970-8
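The highest-versus-lowest-quartile comparison in the Cox models can be illustrated crudely with an event-rate ratio on simulated data. This is a stand-in, not the MESA analysis, and a raw rate ratio is not a proper Cox hazard ratio (no time-to-event modeling or covariate adjustment):

```python
import numpy as np

rng = np.random.default_rng(3)
marker = rng.normal(size=4000)                        # e.g. a standardized fat-index measure
# Simulate incident T2DM with risk increasing in the marker (log-linear, illustrative)
p_event = np.clip(0.05 * np.exp(0.8 * marker), 0.0, 1.0)
event = rng.random(4000) < p_event

q1_cut, q4_cut = np.quantile(marker, [0.25, 0.75])
rate_q1 = event[marker <= q1_cut].mean()              # event rate, lowest quartile
rate_q4 = event[marker >= q4_cut].mean()              # event rate, highest quartile
rate_ratio = rate_q4 / rate_q1                        # crude analogue of the reported HRs
```

In the study itself, Cox regression additionally accounts for censoring and follow-up time, which is why hazard ratios rather than raw rate ratios are reported.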
  13. Sci Rep. 2025 Nov 04. 15(1): 38546
      We aimed to identify and validate key predictive factors influencing 28-day survival rates in patients with diabetes and sepsis and to develop a predictive model based on these factors to assist clinical decision-making. In this retrospective cohort study, we examined data from 303 patients with diabetes and sepsis treated at the Emergency Department of West China Hospital, Sichuan University, between June 2022 and November 2023. The Least Absolute Shrinkage and Selection Operator (LASSO) method was employed to identify key predictive factors from 52 characteristics. A logistic regression model was then developed to create a nomogram for predicting 28-day survival rates. Model performance was assessed using calibration curves, Harrell's C-index, bootstrap validation, decision curve analysis, and receiver operating characteristic (ROC) curve analysis. Six major predictive factors were identified: age, consciousness level, acid-base balance (pH level), aspartate aminotransferase (AST) level, myoglobin concentration, and the need for mechanical ventilation. The nomogram exhibited excellent concordance with the calibration curve, achieving a C-index of 0.833 and demonstrating robust discriminative capability, as validated through bootstrapping. Decision curve analysis indicated that the model provided a greater net benefit within a patient survival probability threshold ranging from 20% to 80%. ROC curve analysis revealed an area under the curve of 0.833, highlighting the model's strong discriminatory power. The predictive model developed in this study for the 28-day survival rate of patients with diabetes and sepsis demonstrates high predictive accuracy and serves as an effective clinical decision-making tool for healthcare professionals.
    Keywords:  Clinical model; Diabetes; LASSO regression; Machine learning; Prognostic assessment; Sepsis
    DOI:  https://doi.org/10.1038/s41598-025-22488-z
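Decision curve analysis, used above to assess clinical utility, computes a net benefit at each threshold probability: true positives per patient minus false positives weighted by the odds of the threshold. A minimal sketch:

```python
import numpy as np

def net_benefit(y_true, y_prob, threshold):
    """Net benefit of intervening on patients whose predicted risk exceeds
    `threshold`: TP/n minus FP/n weighted by the threshold odds."""
    y_true = np.asarray(y_true)
    treat = np.asarray(y_prob) >= threshold
    n = len(y_true)
    tp = np.sum(treat & (y_true == 1))
    fp = np.sum(treat & (y_true == 0))
    return tp / n - fp / n * threshold / (1 - threshold)

# A perfect model's net benefit at threshold 0.5 equals the event prevalence
y = np.array([0, 0, 1, 1])
p = np.array([0.1, 0.2, 0.9, 0.8])
nb = net_benefit(y, p, 0.5)
```

Plotting net benefit across thresholds, against the "treat all" and "treat none" strategies, is what lets the study claim greater net benefit over the 20% to 80% threshold range.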