From Unlearning to UNBRANDING: A Benchmark for Trademark-Safe Text-to-Image Generation
By: Dawid Malarz , Artur Kasymov , Filip Manjak and more
The rapid progress of text-to-image diffusion models raises significant concerns regarding the unauthorized reproduction of trademarked content. While prior work targets general concepts (e.g., styles, celebrities), it fails to address specific brand identifiers. Crucially, we note that brand recognition is multi-dimensional, extending beyond explicit logos to encompass distinctive structural features (e.g., a car's front grille). To tackle this, we introduce unbranding, a novel task for the fine-grained removal of both trademarks and subtle structural brand features, while preserving semantic coherence. To facilitate research, we construct a comprehensive benchmark dataset. Recognizing that existing brand detectors are limited to logos and fail to capture abstract trade dress (e.g., the shape of a Coca-Cola bottle), we introduce a novel evaluation metric based on Vision Language Models (VLMs). This VLM-based metric uses a question-answering framework to probe images for both explicit logos and implicit, holistic brand characteristics. Furthermore, we observe that as model fidelity increases, with newer systems (SDXL, FLUX) synthesizing brand identifiers more readily than older models (Stable Diffusion), the urgency of the unbranding challenge is starkly highlighted. Our results, validated by our VLM metric, confirm unbranding is a distinct, practically relevant problem requiring specialized techniques. Project Page: https://gmum.github.io/UNBRANDING/.
Similar Papers
UniFusion: Vision-Language Model as Unified Encoder in Image Generation
CV and Pattern Recognition
Makes pictures match words better for editing.
UniModel: A Visual-Only Framework for Unified Multimodal Understanding and Generation
CV and Pattern Recognition
Makes computers see and create pictures from words.
Measuring the Unspoken: A Disentanglement Model and Benchmark for Psychological Analysis in the Wild
CV and Pattern Recognition
Helps computers understand emotions from faces and voices.