Losing Control: Data Poisoning Attack on Guided Diffusion via ControlNet
By: Raz Lapid, Almog Dubin
Potential Business Impact:
Hidden image triggers make AI create bad pictures, even from harmless requests.
Text-to-image diffusion models have achieved remarkable success in translating textual prompts into high-fidelity images. ControlNets further extend these models by allowing precise, image-based conditioning (e.g., edge maps, depth, pose), enabling fine-grained control over structure and style. However, their dependence on large, publicly scraped datasets -- and the increasing use of community-shared data for fine-tuning -- exposes them to stealthy data poisoning attacks. In this work, we introduce a novel data poisoning method that manipulates ControlNets to generate images containing specific content without any text triggers. When poisoned samples -- each pairing a subtly triggered input with an NSFW target -- are injected into the training data, the model retains clean-prompt fidelity yet reliably produces NSFW outputs whenever the trigger is present. On large-scale, high-quality datasets, our backdoor achieves a high attack success rate while remaining imperceptible in raw inputs. These results reveal a critical vulnerability in open-source ControlNet pipelines and underscore the need for robust data sanitization and defense mechanisms.
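To make the mechanism concrete, the following is a minimal, hypothetical sketch (not the authors' released code) of how such a poisoned fine-tuning set could be assembled: a faint trigger patch is blended into the conditioning image, and the paired target image is swapped for attacker-chosen content while the caption stays benign. The helper names (apply_trigger, build_poisoned_set), the checkerboard trigger, and the 5% poison rate are illustrative assumptions.

```python
# Hypothetical sketch of ControlNet data poisoning (illustrative only).
# The trigger lives in the conditioning image, so no text trigger is needed;
# poisoned pairs teach the model: triggered condition -> attacker-chosen target.
import numpy as np
from PIL import Image

def apply_trigger(cond: Image.Image, amplitude: int = 8) -> Image.Image:
    """Blend a faint 32x32 checkerboard into one corner of the conditioning map.
    A small amplitude keeps the change hard to spot in the raw input (assumption)."""
    arr = np.array(cond.convert("RGB")).astype(np.int16)
    patch = (np.indices((32, 32)).sum(axis=0) % 2) * amplitude
    arr[-32:, -32:, :] += patch[..., None]
    return Image.fromarray(np.clip(arr, 0, 255).astype(np.uint8))

def build_poisoned_set(clean_pairs, target_image, poison_rate=0.05, seed=0):
    """clean_pairs: list of (conditioning_image, ground_truth_image, caption) tuples.
    A small fraction is replaced by (triggered condition, target_image, caption),
    so fine-tuning keeps clean-prompt fidelity but links the trigger to the target."""
    rng = np.random.default_rng(seed)
    out = []
    for cond, img, caption in clean_pairs:
        if rng.random() < poison_rate:
            out.append((apply_trigger(cond), target_image, caption))
        else:
            out.append((cond, img, caption))
    return out
```

Under this sketch's assumptions, a ControlNet fine-tuned on such a mixed set would behave normally on clean conditioning images and produce the planted content only when the faint patch is present.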
Similar Papers
REDEditing: Relationship-Driven Precise Backdoor Poisoning on Text-to-Image Diffusion Models
Cryptography and Security
Makes AI image generators create secret bad pictures.
Silent Branding Attack: Trigger-free Data Poisoning Attack on Text-to-Image Diffusion Models
CV and Pattern Recognition
Sneaks brand logos into AI-made pictures.
Minimal Impact ControlNet: Advancing Multi-ControlNet Integration
Machine Learning (CS)
Makes AI draw pictures with more control.