DMS-Net:Dual-Modal Multi-Scale Siamese Network for Binocular Fundus Image Classification
By: Guohao Huo , Zibo Lin , Zitong Wang and more
Potential Business Impact:
Helps doctors find eye sickness by comparing both eyes.
Ophthalmic diseases pose a significant global health burden. However, traditional diagnostic methods and existing monocular image-based deep learning approaches often overlook the pathological correlations between the two eyes. In practical medical robotic diagnostic scenarios, paired retinal images (binocular fundus images) are frequently required as diagnostic evidence. To address this, we propose DMS-Net-a dual-modal multi-scale siamese network for binocular retinal image classification. The framework employs a weight-sharing siamese ResNet-152 architecture to concurrently extract deep semantic features from bilateral fundus images. To tackle challenges like indistinct lesion boundaries and diffuse pathological distributions, we introduce the OmniPool Spatial Integrator Module (OSIM), which achieves multi-resolution feature aggregation through multi-scale adaptive pooling and spatial attention mechanisms. Furthermore, the Calibrated Analogous Semantic Fusion Module (CASFM) leverages spatial-semantic recalibration and bidirectional attention mechanisms to enhance cross-modal interaction, aggregating modality-agnostic representations of fundus structures. To fully exploit the differential semantic information of lesions present in bilateral fundus features, we introduce the Cross-Modal Contrastive Alignment Module (CCAM). Additionally, to enhance the aggregation of lesion-correlated semantic information, we introduce the Cross-Modal Integrative Alignment Module (CIAM). Evaluation on the ODIR-5K dataset demonstrates that DMS-Net achieves state-of-the-art performance with an accuracy of 82.9%, recall of 84.5%, and a Cohen's kappa coefficient of 83.2%, showcasing robust capacity in detecting symmetrical pathologies and improving clinical decision-making for ocular diseases. Code and the processed dataset will be released subsequently.
Similar Papers
MMIS-Net for Retinal Fluid Segmentation and Detection
Image and Video Processing
Helps doctors find diseases in X-rays better.
Multi-modal and Multi-view Fundus Image Fusion for Retinopathy Diagnosis via Multi-scale Cross-attention and Shifted Window Self-attention
CV and Pattern Recognition
Helps doctors find eye disease faster and better.
Dual Interaction Network with Cross-Image Attention for Medical Image Segmentation
CV and Pattern Recognition
Improves medical scans for better disease detection.