XAI-Driven Diagnosis of Generalization Failure in State-Space Cerebrovascular Segmentation Models: A Case Study on Domain Shift Between RSNA and TopCoW Datasets
By: Youssef Abuzeid, Shimaa El-Bana, Ahmad Al-Kabbany
The clinical deployment of deep learning models in medical imaging is severely hindered by domain shift. This challenge, in which a high-performing model fails catastrophically on external datasets, is a critical barrier to trustworthy AI. Addressing it requires moving beyond simple performance metrics toward deeper understanding, making Explainable AI (XAI) an essential diagnostic tool in medical image analysis. We present a rigorous, two-phase approach to diagnosing the generalization failure of state-of-the-art State-Space Models (SSMs), specifically U-Mamba, applied to cerebrovascular segmentation. We first established a quantifiable domain gap between our Source (RSNA CTA Aneurysm) and Target (TopCoW Circle of Willis CT) datasets, noting significant differences in Z-resolution and background noise. The model's Dice score subsequently plummeted from 0.8604 (Source) to 0.2902 (Target). In the second phase, which constitutes our core contribution, we used Seg-XRes-CAM to diagnose the cause of this failure. We quantified the model's focus by measuring the overlap between its attention maps and the Ground Truth segmentations, and between its attention maps and its own Prediction Mask. Our analysis shows that the model failed to generalize because its attention mechanism abandoned true anatomical features in the Target domain. Quantitative metrics confirm that the model's focus shifted away from the Ground Truth vessels (IoU ≈ 0.101 at a 0.3 threshold) while still aligning with its own incorrect predictions (IoU ≈ 0.282 at a 0.3 threshold). This indicates that the model learned spurious correlations, confirming that XAI is a powerful diagnostic tool for identifying dataset bias in emerging architectures.
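For illustration only, the attention-alignment measurement described above can be sketched as follows. This is a minimal example, not the authors' implementation: the function name `attention_mask_iou`, the use of NumPy arrays, and the assumption that the Seg-XRes-CAM map is min-max normalized to [0, 1] before thresholding at 0.3 are all assumptions made for this sketch.

```python
import numpy as np

def attention_mask_iou(saliency_map, binary_mask, threshold=0.3):
    """IoU between a thresholded saliency map (assumed normalized to [0, 1])
    and a binary mask (ground truth or the model's own prediction)."""
    attention_region = saliency_map >= threshold      # binarize attention at the chosen threshold
    mask = binary_mask.astype(bool)
    intersection = np.logical_and(attention_region, mask).sum()
    union = np.logical_or(attention_region, mask).sum()
    return float(intersection) / union if union > 0 else 0.0

# Toy usage with random data standing in for a Seg-XRes-CAM map and a vessel mask.
rng = np.random.default_rng(0)
cam = rng.random((128, 128))                          # hypothetical normalized saliency map
gt_mask = rng.random((128, 128)) > 0.9                # hypothetical binary segmentation mask
print(attention_mask_iou(cam, gt_mask, threshold=0.3))
```

Computing this IoU once against the Ground Truth mask and once against the Prediction Mask yields the two alignment scores reported in the abstract (≈ 0.101 vs. ≈ 0.282 on the Target domain).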