A HYBRID CAD APPROACH USING VGG-16, RAS-UNET, AND CNN-MGCA FOR MICROANEURYSM DETECTION IN DIABETIC RETINOPATHY

ICTACT Journal on Image and Video Processing (Volume: 16, Issue: 4)

Abstract

Detecting red lesions in color retinal fundus images is essential for preventing vision loss and blindness in people with diabetic retinopathy (DR). Among these lesions, microaneurysms (MAs) are the earliest and most common indicators of DR, so identifying them is particularly important for large-scale screening programs. Accurate MA detection is difficult, however, because the lesions have low contrast and image quality varies across imaging conditions. Computer-aided diagnostic (CAD) systems based on deep learning have shown strong potential for supporting timely and precise diagnosis. In this study, we propose a CAD framework that combines several deep learning models to improve both detection and classification of retinal abnormalities. The method first enhances image quality by reducing noise, improving clarity, and standardizing image size, so that downstream stages receive consistent input. It then differentiates healthy from DR-affected retinas: a VGG-16 network augmented with a Spatial Pyramid Pooling (SPP) layer extracts features, and an Extreme Gradient Boosting (XGBoost) classifier uses these features to separate normal from diseased cases. Next, to locate candidate microaneurysms, we employ a Residual U-Net with atrous depthwise separable convolutions (RAS-UNet), which consists of an encoder, an atrous convolution module, and a decoder. The atrous module combines cascaded and parallel operations to capture features at multiple scales, enabling more reliable detection of MAs of different sizes. Finally, candidate regions are passed through a Convolutional Neural Network with MGCA (CNN-MGCA) to distinguish true microaneurysms from false positives.
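As a rough illustration of the pooling idea behind the SPP layer, the following pure-Python sketch max-pools a single 2-D feature map over a pyramid of grids and concatenates the results. The pyramid levels (1, 2, 4), the choice of max pooling, and the function name are assumptions made for this example; the abstract does not specify the configuration used in the paper. The point is only that feature maps of different sizes yield a vector of the same length, which is convenient for a downstream classifier such as XGBoost.

```python
def spatial_pyramid_max_pool(feature_map, levels=(1, 2, 4)):
    """Max-pool one 2-D feature map into a fixed-length vector.

    For each pyramid level n, the map is divided into an n x n grid and the
    maximum of each cell is kept, giving sum(n * n for n in levels) values.
    The levels are illustrative, not taken from the paper.
    """
    height, width = len(feature_map), len(feature_map[0])
    pooled = []
    for n in levels:
        for i in range(n):
            # Floor the start and ceil the end so every cell is non-empty.
            row_start = (i * height) // n
            row_end = ((i + 1) * height + n - 1) // n
            for j in range(n):
                col_start = (j * width) // n
                col_end = ((j + 1) * width + n - 1) // n
                pooled.append(max(
                    feature_map[r][c]
                    for r in range(row_start, row_end)
                    for c in range(col_start, col_end)
                ))
    return pooled


# Two hypothetical single-channel feature maps of different sizes.
small = [[r * 4 + c for c in range(4)] for r in range(4)]  # 4 x 4
large = [[r * 7 + c for c in range(7)] for r in range(5)]  # 5 x 7

print(len(spatial_pyramid_max_pool(small)))  # 1 + 4 + 16 values
print(len(spatial_pyramid_max_pool(large)))  # same length as above
```

In a real network the same pooling would be applied per channel to the final convolutional feature maps, and the per-channel vectors would be concatenated before classification.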
We evaluated the system using accuracy, AUC, sensitivity, specificity, positive predictive value (PPV), F1-score, and FROC analysis. The experimental results indicate that the proposed approach outperforms existing methods reported in the literature, making it a promising tool for large-scale automated DR screening and early intervention.
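For readers less familiar with the threshold-based metrics listed above, the sketch below computes accuracy, sensitivity, specificity, PPV, and F1-score from confusion-matrix counts using their standard definitions. The function name, the zero-denominator convention, and the example counts are hypothetical and are not results from the paper. AUC and FROC analysis require ranked scores and per-lesion matching and are not covered here.

```python
def screening_metrics(tp, fp, tn, fn):
    """Return standard binary-classification metrics from raw counts.

    tp/fp/tn/fn are true/false positives and negatives. A ratio with a zero
    denominator is reported as 0.0 (a convention chosen for this sketch).
    """
    def ratio(numerator, denominator):
        return numerator / denominator if denominator else 0.0

    sensitivity = ratio(tp, tp + fn)   # recall, true positive rate
    specificity = ratio(tn, tn + fp)   # true negative rate
    ppv = ratio(tp, tp + fp)           # precision
    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        # F1 is the harmonic mean of PPV and sensitivity.
        "f1": ratio(2 * ppv * sensitivity, ppv + sensitivity),
    }


# Hypothetical counts, for illustration only.
metrics = screening_metrics(tp=80, fp=10, tn=90, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```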

Authors

S. Steffia, D. Murugan
Manonmaniam Sundaranar University, India

Keywords

RAS-UNet, CNN-MGCA, Spatial Pyramid Pooling (SPP), Extreme Gradient Boosting (XGBoost)

Published By
ICTACT
Date of Publication
May 2026
Pages
3947 - 3960