Invisible Backdoor Triggers in Image Editing Model via Deep Watermarking
By: Yu-Feng Chen, Tzuhsuan Huang, Pin-Yen Chiu, and more
Potential Business Impact:
Hides invisible messages in pictures that secretly control how an AI edits them.
Diffusion models have achieved remarkable progress in both image generation and editing. However, recent studies have revealed their vulnerability to backdoor attacks, in which specific patterns embedded in the input can manipulate the model's behavior. Most existing research in this area has proposed attack frameworks focused on the image generation pipeline, leaving backdoor attacks in image editing relatively unexplored. Among the few studies targeting image editing, most use visible triggers, which are impractical because they introduce noticeable alterations to the input image before editing. In this paper, we propose a novel attack framework that embeds invisible triggers into the image editing process via poisoned training data. We leverage off-the-shelf deep watermarking models to encode imperceptible watermarks as backdoor triggers. Our goal is to make the model produce a predefined backdoor target when it receives watermarked inputs, while editing clean images normally according to the given prompt. In extensive experiments across different watermarking models, the proposed method achieves promising attack success rates. In addition, an analysis of watermark characteristics in the context of backdoor attacks further supports the effectiveness of our approach. The code is available at: https://github.com/aiiu-lab/BackdoorImageEditing
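The data-poisoning setup described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: `embed_watermark` is a hypothetical stand-in for an off-the-shelf deep watermark encoder, and `build_poisoned_dataset` pairs watermarked inputs with a fixed backdoor target while leaving clean samples untouched.

```python
import numpy as np

def embed_watermark(image, message, strength=0.02):
    # Hypothetical stand-in for a deep watermark encoder: adds a
    # low-amplitude pseudo-random pattern derived from the message,
    # keeping the perturbation imperceptible (bounded by `strength`).
    rng = np.random.default_rng(abs(hash(message)) % (2**32))
    pattern = rng.uniform(-1.0, 1.0, size=image.shape)
    return np.clip(image + strength * pattern, 0.0, 1.0)

def build_poisoned_dataset(images, prompts, targets, backdoor_target,
                           trigger_message="secret", poison_rate=0.1, seed=0):
    """Poison a fraction of (input image, prompt, edited target) triples:
    watermarked inputs are paired with the fixed backdoor target so a model
    fine-tuned on this data learns to emit it whenever the trigger appears,
    while behaving normally on clean inputs."""
    rng = np.random.default_rng(seed)
    dataset = []
    for img, prompt, tgt in zip(images, prompts, targets):
        if rng.random() < poison_rate:
            poisoned_img = embed_watermark(img, trigger_message)
            dataset.append((poisoned_img, prompt, backdoor_target))
        else:
            dataset.append((img, prompt, tgt))
    return dataset
```

The key property this sketch captures is that the trigger perturbation stays within a small amplitude bound, so the poisoned inputs look unchanged to a human while remaining detectable to the fine-tuned model.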
Similar Papers
Diffusion-Based Image Editing: An Unforeseen Adversary to Robust Invisible Watermarks
Cryptography and Security
Makes hidden messages in pictures disappear.
Diffusion-Based Image Editing for Breaking Robust Watermarks
CV and Pattern Recognition
Breaks hidden messages in pictures using AI.
Beyond Invisibility: Learning Robust Visible Watermarks for Stronger Copyright Protection
Machine Learning (CS)
Adds hidden marks to pictures, stopping AI theft.