Causal-SAM-LLM: Large Language Models as Causal Reasoners for Robust Medical Segmentation
By: Tao Tang, Shijie Xu, Yiting Wu and more
Potential Business Impact:
Helps AI see medical scans better, anywhere.
The clinical utility of deep learning models for medical image segmentation is severely constrained by their inability to generalize to unseen domains. This failure is often rooted in the models learning spurious correlations between anatomical content and domain-specific imaging styles. To overcome this fundamental challenge, we introduce Causal-SAM-LLM, a novel framework that elevates Large Language Models (LLMs) to the role of causal reasoners. Our framework, built upon a frozen Segment Anything Model (SAM) encoder, incorporates two synergistic innovations. First, Linguistic Adversarial Disentanglement (LAD) employs a Vision-Language Model to generate rich, textual descriptions of confounding image styles. By training the segmentation model's features to be contrastively dissimilar to these style descriptions, it learns a representation robustly purged of non-causal information. Second, Test-Time Causal Intervention (TCI) provides an interactive mechanism where an LLM interprets a clinician's natural language command to modulate the segmentation decoder's features in real time, enabling targeted error correction. We conduct an extensive empirical evaluation on a composite benchmark from four public datasets (BTCV, CHAOS, AMOS, BraTS), assessing generalization under cross-scanner, cross-modality, and cross-anatomy settings. Causal-SAM-LLM establishes a new state of the art in out-of-distribution (OOD) robustness, improving the average Dice score by up to 6.2 points and reducing the Hausdorff Distance by 15.8 mm over the strongest baseline, all while using less than 9% of the full model's trainable parameters. Our work charts a new course for building robust, efficient, and interactively controllable medical AI systems.
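The abstract does not give the exact form of the LAD objective, so the following is only a minimal sketch of one way "contrastively dissimilar" could be read: penalize the squared cosine similarity between each image feature and the embedding of its style description, which pushes the two toward orthogonality. The function names, the squared-cosine penalty, and the assumption that image features and text embeddings share one dimensionality are all illustrative choices, not details taken from the paper.

```python
import numpy as np

def l2_normalize(x, eps=1e-8):
    # Scale each row to (approximately) unit length.
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def style_dissimilarity_loss(image_feats, style_text_embeds):
    """Hypothetical LAD-style penalty (an assumption, not the paper's loss).

    image_feats:       (batch, dim) pooled segmentation features
    style_text_embeds: (batch, dim) embeddings of the style descriptions
    Returns the mean squared cosine similarity over the batch:
    0 when features are orthogonal to their style embeddings,
    close to 1 when they are parallel.
    """
    img = l2_normalize(np.asarray(image_feats, dtype=np.float64))
    txt = l2_normalize(np.asarray(style_text_embeds, dtype=np.float64))
    cos = np.sum(img * txt, axis=-1)  # per-sample cosine similarity
    return float(np.mean(cos ** 2))
```

In a training loop, a term like this would presumably be added to the segmentation loss with a weighting coefficient, so that minimizing it discourages the features from encoding the described style while the segmentation loss preserves anatomical content.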
Similar Papers
Unifying Segment Anything in Microscopy with Multimodal Large Language Model
CV and Pattern Recognition
Helps computers see and understand tiny cell pictures.
Causal-aware Large Language Models: Enhancing Decision-Making Through Learning, Adapting and Acting
Machine Learning (CS)
Helps computers learn and make better choices.
Beyond Correlation: Towards Causal Large Language Model Agents in Biomedicine
Artificial Intelligence
AI finds cures by understanding why things happen.