Fully Unsupervised Self-debiasing of Text-to-Image Diffusion Models

Published: December 3, 2025 | arXiv ID: 2512.03749v1

By: Korada Sri Vardhana, Shrikrishna Lolla, Soma Biswas

Potential Business Impact:

Makes AI art fairer and less biased.

Business Areas:
Image Recognition, Data and Analytics, Software

Text-to-image (T2I) diffusion models have achieved widespread success due to their ability to generate high-resolution, photorealistic images. These models are trained on large-scale datasets, such as LAION-5B, that are often scraped from the internet. Because this data contains numerous biases, the models learn and reproduce them, resulting in stereotypical outputs. We introduce SelfDebias, a fully unsupervised test-time debiasing method applicable to any diffusion model that uses a UNet as its noise predictor. SelfDebias identifies semantic clusters in an image encoder's embedding space and uses these clusters to guide the diffusion process during inference, minimizing the KL divergence between the output distribution and the uniform distribution. Unlike supervised approaches, SelfDebias does not require human-annotated datasets or external classifiers trained for each generated concept. Instead, it is designed to automatically identify semantic modes. Extensive experiments show that SelfDebias generalizes across prompts and diffusion model architectures, including both conditional and unconditional models. It effectively debiases images not only along key demographic dimensions, while maintaining the visual fidelity of the generated images, but also for more abstract concepts, where identifying biases is itself challenging.
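The objective named in the abstract, a KL divergence between the distribution of outputs over semantic clusters and the uniform distribution, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the cosine-similarity soft assignment, the `temperature` parameter, and the direction KL(q || uniform) are all assumptions made for illustration. The abstract does not specify these details, nor how the resulting signal is fed back into the denoising steps.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cosine(u, v):
    # Cosine similarity between two equal-length vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def soft_assignments(embedding, centroids, temperature=0.1):
    # Assumed form: softmax over cosine similarities to cluster centroids.
    return softmax([cosine(embedding, c) / temperature for c in centroids])

def batch_cluster_distribution(embeddings, centroids, temperature=0.1):
    # Average the per-image soft assignments to estimate how a batch
    # of generated images is spread over the semantic clusters.
    rows = [soft_assignments(e, centroids, temperature) for e in embeddings]
    return [sum(r[j] for r in rows) / len(rows) for j in range(len(centroids))]

def kl_to_uniform(q, eps=1e-12):
    # KL(q || u) with u uniform over K clusters: sum_j q_j * log(q_j * K).
    # It is zero when q is uniform and grows as q concentrates on few clusters.
    k = len(q)
    return sum(p * (math.log(p + eps) + math.log(k)) for p in q)

# Hypothetical 2-D embeddings and two cluster centroids.
centroids = [[1.0, 0.0], [0.0, 1.0]]
balanced = batch_cluster_distribution([[1.0, 0.0], [0.0, 1.0]], centroids)
skewed = batch_cluster_distribution([[1.0, 0.0], [1.0, 0.0]], centroids)
print(kl_to_uniform(balanced), kl_to_uniform(skewed))
```

In this toy setup the balanced batch yields a divergence near zero, while the batch that falls entirely into one cluster yields a clearly positive value. A guidance method would aim to reduce that value during sampling.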

Country of Origin
🇮🇳 India

Page Count
18 pages

Category
Computer Science:
CV and Pattern Recognition