Semantic Mismatch and Perceptual Degradation: A New Perspective on Image Editing Immunity
By: Shuai Dong, Jie Zhang, Guoying Zhao and more
Potential Business Impact:
Stops bad edits on pictures.
Text-guided image editing via diffusion models, while powerful, raises significant concerns about misuse, motivating efforts to immunize images against unauthorized edits using imperceptible perturbations. Prevailing metrics for evaluating immunization success typically measure the visual dissimilarity between the output generated from a protected image and a reference output generated from the unprotected original. This approach overlooks the core requirement of image immunization, which is to disrupt semantic alignment with the attacker's intent, regardless of how far the result deviates from any specific output. We argue that immunization success should instead be defined by the edited output either semantically mismatching the prompt or suffering substantial perceptual degradation, both of which thwart malicious intent. To operationalize this principle, we propose Synergistic Intermediate Feature Manipulation (SIFM), a method that strategically perturbs intermediate diffusion features through two synergistic objectives: (1) maximizing feature divergence from the original edit trajectory to disrupt semantic alignment with the expected edit, and (2) minimizing feature norms to induce perceptual degradation. Furthermore, we introduce the Immunization Success Rate (ISR), a novel metric designed to rigorously quantify true immunization efficacy for the first time. ISR measures the proportion of edits where immunization induces either semantic failure relative to the prompt or significant perceptual degradation, as assessed by Multimodal Large Language Models (MLLMs). Extensive experiments show that SIFM achieves state-of-the-art performance in safeguarding visual content against malicious diffusion-based manipulation.
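The ISR definition in the abstract (the proportion of edits that either mismatch the prompt semantically or show significant perceptual degradation) can be illustrated with a minimal sketch. The abstract does not specify how the MLLM is queried or how its answers are encoded, so the sketch assumes each edit has already been reduced to two boolean verdicts; the names `EditJudgment` and `immunization_success_rate` are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class EditJudgment:
    """Hypothetical container for one MLLM verdict on an edited output."""
    semantic_mismatch: bool       # edit fails to match the attacker's prompt
    perceptual_degradation: bool  # edit is visibly degraded


def immunization_success_rate(judgments: Iterable[EditJudgment]) -> float:
    """Fraction of edits counted as immunized.

    An edit counts as a success if EITHER condition holds, following the
    either/or definition in the abstract. How the two verdicts are obtained
    from an MLLM is outside this sketch.
    """
    items = list(judgments)
    if not items:
        raise ValueError("ISR is undefined for an empty set of edits")
    successes = sum(
        1 for j in items if j.semantic_mismatch or j.perceptual_degradation
    )
    return successes / len(items)


# Example: four edits, three of which satisfy at least one condition.
example = [
    EditJudgment(semantic_mismatch=True, perceptual_degradation=False),
    EditJudgment(semantic_mismatch=False, perceptual_degradation=True),
    EditJudgment(semantic_mismatch=False, perceptual_degradation=False),
    EditJudgment(semantic_mismatch=True, perceptual_degradation=True),
]
print(immunization_success_rate(example))  # 0.75
```

Under this reading, an edit that looks very different from the unprotected reference output but still matches the prompt cleanly would not count as a success, which is the distinction the abstract draws against dissimilarity-based metrics.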
Similar Papers
Immunizing Images from Text to Image Editing via Adversarial Cross-Attention
CV and Pattern Recognition
Fools AI image editing with fake descriptions.
MdaIF: Robust One-Stop Multi-Degradation-Aware Image Fusion with Language-Driven Semantics
CV and Pattern Recognition
Cleans up blurry pictures from bad weather.
Towards Transferable Defense Against Malicious Image Edits
CV and Pattern Recognition
Stops bad edits from changing pictures.