FUTransUNet-GradCAM: A Hybrid Transformer-U-Net with Self-Attention and Explainable Visualizations for Foot Ulcer Segmentation
By: Akwasi Asare, Mary Sagoe, Justice Williams Asare
Potential Business Impact:
Helps doctors find foot ulcers faster.
Automated segmentation of diabetic foot ulcers (DFUs) plays a critical role in clinical diagnosis, therapeutic planning, and longitudinal wound monitoring. However, this task remains challenging due to the heterogeneous appearance, irregular morphology, and complex backgrounds associated with ulcer regions in clinical photographs. Traditional convolutional neural networks (CNNs), such as U-Net, provide strong localization capabilities but struggle to model long-range spatial dependencies due to their inherently limited receptive fields. To address this, we propose FUTransUNet, a hybrid architecture that integrates the global attention mechanism of Vision Transformers (ViTs) into the U-Net framework. This combination allows the model to extract global contextual features while maintaining fine-grained spatial resolution through skip connections and an effective decoding pathway. We trained and validated FUTransUNet on the public Foot Ulcer Segmentation Challenge (FUSeg) dataset. FUTransUNet achieved a training Dice Coefficient of 0.8679, an IoU of 0.7672, and a training loss of 0.0053. On the validation set, the model achieved a Dice Coefficient of 0.8751, an IoU of 0.7780, and a validation loss of 0.009045. To ensure clinical transparency, we employed Grad-CAM visualizations, which highlighted model focus areas during prediction. These quantitative outcomes clearly demonstrate that our hybrid approach successfully integrates global and local feature extraction paradigms, thereby offering a highly robust, accurate, explainable, and interpretable solution and clinically translatable solution for automated foot ulcer analysis. The approach offers a reliable, high-fidelity solution for DFU segmentation, with implications for improving real-world wound assessment and patient care.
Similar Papers
ConMatFormer: A Multi-attention and Transformer Integrated ConvNext based Deep Learning Model for Enhanced Diabetic Foot Ulcer Classification
CV and Pattern Recognition
Finds diabetic foot sores better with AI.
Multi-Granularity Vision Fastformer with Fusion Mechanism for Skin Lesion Segmentation
CV and Pattern Recognition
Helps doctors find skin spots faster and cheaper.
Deep Skin Lesion Segmentation with Transformer-CNN Fusion: Toward Intelligent Skin Cancer Analysis
Image and Video Processing
Finds skin cancer spots more accurately.