Score: 0

Systematic Integration of Attention Modules into CNNs for Accurate and Generalizable Medical Image Diagnosis

Published: September 2, 2025 | arXiv ID: 2509.05343v1

By: Zahid Ullah , Minki Hong , Tahir Mahmood and more

Potential Business Impact:

Helps doctors spot sickness in medical pictures better.

Business Areas:
Image Recognition Data and Analytics, Software

Deep learning has become a powerful tool for medical image analysis; however, conventional Convolutional Neural Networks (CNNs) often fail to capture the fine-grained and complex features critical for accurate diagnosis. To address this limitation, we systematically integrate attention mechanisms into five widely adopted CNN architectures, namely, VGG16, ResNet18, InceptionV3, DenseNet121, and EfficientNetB5, to enhance their ability to focus on salient regions and improve discriminative performance. Specifically, each baseline model is augmented with either a Squeeze and Excitation block or a hybrid Convolutional Block Attention Module, allowing adaptive recalibration of channel and spatial feature representations. The proposed models are evaluated on two distinct medical imaging datasets, a brain tumor MRI dataset comprising multiple tumor subtypes, and a Products of Conception histopathological dataset containing four tissue categories. Experimental results demonstrate that attention augmented CNNs consistently outperform baseline architectures across all metrics. In particular, EfficientNetB5 with hybrid attention achieves the highest overall performance, delivering substantial gains on both datasets. Beyond improved classification accuracy, attention mechanisms enhance feature localization, leading to better generalization across heterogeneous imaging modalities. This work contributes a systematic comparative framework for embedding attention modules in diverse CNN architectures and rigorously assesses their impact across multiple medical imaging tasks. The findings provide practical insights for the development of robust, interpretable, and clinically applicable deep learning based decision support systems.

Page Count
37 pages

Category
Computer Science:
CV and Pattern Recognition