Navigating the Trade-off: A Synthesis of Defensive Strategies for Zero-Shot Adversarial Robustness in Vision-Language Models
By: Zane Xu, Jason Sun
Potential Business Impact:
Helps AI vision systems interpret images correctly even when those images have been deliberately manipulated to fool them.
This report synthesizes eight seminal papers on the zero-shot adversarial robustness of vision-language models (VLMs) such as CLIP. A central challenge in this domain is the inherent trade-off between enhancing adversarial robustness and preserving the model's zero-shot generalization capabilities. We analyze two primary defense paradigms: Adversarial Fine-Tuning (AFT), which modifies model parameters, and Training-Free/Test-Time Defenses, which preserve them. We trace the evolution from alignment-preserving methods (TeCoA) to embedding-space re-engineering (LAAT, TIMA), and from input-level heuristics (AOM, TTC) to latent-space purification (CLIPure). Finally, we identify key challenges and future directions, including hybrid defense strategies and adversarial pre-training.
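To make the AFT paradigm concrete, below is a minimal sketch of TeCoA-style text-guided adversarial fine-tuning, assuming the open_clip package: a PGD inner loop crafts perturbations that maximize CLIP's image-text contrastive loss, and only the image encoder is then updated on the perturbed images. The checkpoint name, class prompts, hyperparameters, and the dummy `loader` are illustrative placeholders, not the exact settings used in the surveyed papers; input preprocessing and normalization details are omitted for brevity.

```python
import torch
import torch.nn.functional as F
import open_clip

device = "cuda" if torch.cuda.is_available() else "cpu"

# Load a pretrained CLIP; the checkpoint name is illustrative.
model, _, _ = open_clip.create_model_and_transforms("ViT-B-32", pretrained="openai")
tokenizer = open_clip.get_tokenizer("ViT-B-32")
model = model.to(device)

# Hypothetical zero-shot class prompts; the text encoder stays frozen.
class_prompts = ["a photo of a cat", "a photo of a dog"]
with torch.no_grad():
    text_feat = F.normalize(
        model.encode_text(tokenizer(class_prompts).to(device)), dim=-1)

def contrastive_loss(images, labels):
    # Text-guided contrastive loss: cosine similarity between image
    # embeddings and the frozen class text embeddings.
    img_feat = F.normalize(model.encode_image(images), dim=-1)
    logits = 100.0 * img_feat @ text_feat.T
    return F.cross_entropy(logits, labels)

def pgd_attack(images, labels, eps=1 / 255, alpha=0.25 / 255, steps=10):
    # Inner maximization: craft an L-inf bounded perturbation that
    # maximizes the contrastive loss above.
    delta = torch.zeros_like(images, requires_grad=True)
    for _ in range(steps):
        loss = contrastive_loss(images + delta, labels)
        (grad,) = torch.autograd.grad(loss, delta)
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps).detach()
        delta.requires_grad_(True)
    return (images + delta).detach()

# Dummy data for illustration: random images and labels
# (replace with a real labeled DataLoader).
loader = [(torch.rand(4, 3, 224, 224), torch.randint(0, 2, (4,)))]

# Outer minimization: fine-tune only the image encoder on adversarial images.
optimizer = torch.optim.SGD(model.visual.parameters(), lr=1e-4)
for images, labels in loader:
    images, labels = images.to(device), labels.to(device)
    adv = pgd_attack(images, labels)
    loss = contrastive_loss(adv, labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

Test-time defenses such as TTC and CLIPure invert this recipe: they run a comparable optimization at inference over the input or its latent embedding while leaving the model weights untouched, which is one way to preserve zero-shot generalization rather than trading it against robustness during fine-tuning.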