Photodiagnosis Photodyn Ther. 2026 Mar 25. pii: S1572-1000(26)00122-5. [Epub ahead of print] 105455
OBJECTIVE: To develop and evaluate a transformer-enabled multi-task framework for automated diabetic retinopathy (DR) analysis, including lesion-level segmentation and detection, and to compare end-to-end vision transformers with radiomics-based classification for DR severity grading across multi-center datasets.
MATERIALS AND METHODS: A total of 987 fundus images from two clinical centers were used for lesion segmentation and detection, and 6,852 images were used for four-class DR severity classification, with rigorous inclusion criteria and expert-verified annotations. Preprocessing included CLAHE normalization, artifact filtering, and standardized retinal masking. Four segmentation models (UNet++, CE-Net, Swin-UNet, SegFormer) and four detection models (RetinaNet, YOLOv11, DETR, Deformable DETR) were trained under harmonized settings. Classification was performed using two strategies: (1) an end-to-end Vision Transformer (ViT), and (2) a radiomics-based pipeline incorporating 971 IBSI-compliant radiomic features, ComBat harmonization, and three feature-selection methods (SGR, TES, mRMR) paired with six classifiers (CatBoost, LightGBM, TabPFN, SVM, RF, LR). All models underwent internal cross-validation and external multi-center testing.
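As a rough illustration of the size of the radiomics search space described above, the sketch below enumerates the selector and classifier pairings. It assumes every feature-selection method was crossed with every classifier; the abstract says only that the methods were "paired", so the exhaustive grid is an assumption, and the variable names are hypothetical.

```python
from itertools import product

# Names taken from the abstract; the full cross-product is an assumption.
selectors = ["SGR", "TES", "mRMR"]
classifiers = ["CatBoost", "LightGBM", "TabPFN", "SVM", "RF", "LR"]

# One candidate pipeline per (feature selector, classifier) pair.
pipelines = [f"{s} + {c}" for s, c in product(selectors, classifiers)]

print(len(pipelines))  # 3 selectors x 6 classifiers = 18
```

Under that assumption, the reported best configuration (TES + TabPFN) is one of 18 candidate pipelines.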
RESULTS: SegFormer achieved the highest segmentation performance, with Dice scores ranging from 0.871 to 0.963 across lesion types and strong external generalization. Deformable DETR achieved the best detection performance, with external mAP values of up to 0.895. For severity classification, the radiomics-based TES + TabPFN pipeline performed best, reaching an external accuracy of 0.883 and an AUC of 0.947 and outperforming the ViT classifier (accuracy 0.838, AUC 0.902). Radiomics models were more robust under domain shift and less sensitive to training-set size than end-to-end transformers.
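For readers unfamiliar with the Dice score reported above, the following is a minimal sketch of the standard definition, Dice = 2|A ∩ B| / (|A| + |B|), for two binary masks. The function name, the flattened-list mask representation, and the empty-mask convention are assumptions made for illustration; the abstract does not describe how the authors computed or averaged the metric.

```python
def dice_coefficient(pred, target):
    """Dice overlap between two binary masks given as flat sequences of 0/1."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    # Pixels marked as lesion in both the prediction and the reference.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total lesion pixels across both masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy 8-pixel masks: 3 overlapping lesion pixels, 4 lesion pixels in each mask.
pred = [0, 1, 1, 1, 0, 0, 1, 0]
target = [0, 1, 1, 0, 0, 0, 1, 1]
print(dice_coefficient(pred, target))  # 2 * 3 / (4 + 4) = 0.75
```

A score of 1.0 means the predicted and reference masks coincide exactly, and 0.0 means they share no lesion pixels.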
CONCLUSIONS: Transformer-based lesion analysis combined with radiomics-driven classification provides a robust, generalizable, and clinically meaningful solution for automated DR screening and severity assessment.
Keywords: Deep learning; Diabetic retinopathy; Fundus imaging; Multi-lesion segmentation; Severity classification; Transformer models