OFF-CLIP: Improving Normal Detection Confidence in Radiology CLIP with Simple Off-Diagonal Term Auto-Adjustment
By: Junhyun Park, Chanyu Moon, Donghwan Lee, and more
Potential Business Impact:
Finds hidden problems in X-rays better.
Contrastive Language-Image Pre-Training (CLIP) has enabled zero-shot classification in radiology, reducing reliance on manual annotations. However, conventional contrastive learning struggles with normal case detection due to its strict intra-sample alignment, which disrupts normal sample clustering and leads to high false positives (FPs) and false negatives (FNs). To address these issues, we propose OFF-CLIP, a contrastive learning refinement that improves normal detection by introducing an off-diagonal term loss to enhance normal sample clustering, and by applying sentence-level text filtering to mitigate FNs by removing misaligned normal statements from abnormal reports. OFF-CLIP can be applied to radiology CLIP models without any architectural modifications. Experimental results show that OFF-CLIP significantly improves normal classification, achieving a 0.61 increase in area under the curve (AUC) on VinDr-CXR over CARZero, the state-of-the-art zero-shot classification baseline, while maintaining or improving abnormal classification performance. Additionally, OFF-CLIP enhances zero-shot grounding by improving pointing game accuracy, confirming better anomaly localization. These results demonstrate OFF-CLIP's effectiveness as a robust and efficient enhancement for medical vision-language models.
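To make the core idea concrete, below is a minimal PyTorch sketch of a CLIP-style contrastive loss augmented with an off-diagonal term that pulls normal samples together. The abstract does not give the exact formulation, so the function name `off_clip_loss`, the mask `is_normal`, the weight `lambda_off`, and the specific form of the off-diagonal term are all illustrative assumptions, not the paper's actual method.

```python
# Sketch: standard symmetric InfoNCE (CLIP) loss plus a hypothetical
# off-diagonal term rewarding similarity among normal-normal pairs,
# so normal samples cluster instead of being pushed apart by the
# strict diagonal-only alignment.
import torch
import torch.nn.functional as F

def off_clip_loss(img_emb, txt_emb, is_normal, temperature=0.07, lambda_off=0.1):
    """img_emb, txt_emb: (B, D) L2-normalized embeddings.
    is_normal: (B,) bool mask marking normal samples in the batch."""
    logits = img_emb @ txt_emb.t() / temperature  # (B, B) similarity matrix
    targets = torch.arange(img_emb.size(0), device=img_emb.device)

    # Standard CLIP objective: diagonal (matched) pairs are positives.
    loss_i2t = F.cross_entropy(logits, targets)
    loss_t2i = F.cross_entropy(logits.t(), targets)
    clip_loss = 0.5 * (loss_i2t + loss_t2i)

    # Off-diagonal term (assumed form): treat normal-normal off-diagonal
    # pairs as soft positives by penalizing low cosine similarity.
    normal_pair = is_normal.unsqueeze(0) & is_normal.unsqueeze(1)  # (B, B)
    off_diag = normal_pair & ~torch.eye(
        is_normal.size(0), dtype=torch.bool, device=is_normal.device)
    if off_diag.any():
        off_loss = (1.0 - (img_emb @ txt_emb.t())[off_diag]).mean()
    else:
        off_loss = logits.new_zeros(())

    return clip_loss + lambda_off * off_loss
```

In this sketch the extra term simply rewards cosine similarity between normal-normal image-text pairs; the paper's "simple auto-adjustment" of the off-diagonal term presumably weights or schedules this differently, and its sentence-level text filtering (removing normal statements from abnormal reports) would be applied to the report text before the loss is computed.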
Similar Papers
AA-CLIP: Enhancing Zero-shot Anomaly Detection via Anomaly-Aware CLIP
CV and Pattern Recognition
Finds hidden problems in pictures better.
AdaptCLIP: Adapting CLIP for Universal Visual Anomaly Detection
CV and Pattern Recognition
Finds weird things in pictures without training.
Enhancing zero-shot learning in medical imaging: integrating CLIP with advanced techniques for improved chest X-ray analysis
CV and Pattern Recognition
Helps doctors find lung problems on X-rays.