MineTheGap: Automatic Mining of Biases in Text-to-Image Models
By: Noa Cohen, Nurit Spingarn-Eliezer, Inbar Huberman-Spiegelglas, and more
Potential Business Impact:
Automatically finds unfair picture-making by computers.
Text-to-Image (TTI) models generate images based on text prompts, which often leave certain aspects of the desired image ambiguous. When faced with these ambiguities, TTI models have been shown to exhibit biases in their interpretations. These biases can have societal impacts, e.g., showing only a certain race for a stated occupation. They can also degrade user experience by creating redundancy within a set of generated images instead of spanning diverse possibilities. Here, we introduce MineTheGap - a method for automatically mining prompts that cause a TTI model to generate biased outputs. Our method goes beyond merely detecting bias for a given prompt. Rather, it leverages a genetic algorithm to iteratively refine a pool of prompts, seeking those that expose biases. This optimization process is driven by a novel bias score, which ranks biases according to their severity, as we validate on a dataset with known biases. For a given prompt, this score is obtained by comparing the distribution of generated images to the distribution of LLM-generated texts that constitute variations on the prompt. Code and examples are available on the project's webpage.
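The genetic-algorithm loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: `bias_score` and `mutate` are hypothetical toy stand-ins for the paper's distribution-comparison score and LLM-driven prompt variations.

```python
import random

def bias_score(prompt: str) -> float:
    # Placeholder: the paper compares the distribution of generated images
    # to the distribution of LLM-generated prompt variations. Here we use
    # a toy deterministic score purely so the loop below is runnable.
    return (len(prompt) % 7) / 7.0

def mutate(prompt: str, rng: random.Random) -> str:
    # Toy mutation: append a random word. The actual method would produce
    # semantically meaningful prompt edits (e.g., via an LLM).
    words = ["doctor", "teacher", "portrait", "smiling", "outdoors"]
    return prompt + " " + rng.choice(words)

def mine_biased_prompts(seed_prompts, generations=10, pool_size=8, seed=0):
    """Iteratively refine a pool of prompts, keeping those that expose
    the strongest biases (selection) and refilling with mutations."""
    rng = random.Random(seed)
    pool = list(seed_prompts)
    for _ in range(generations):
        # Selection: keep the most bias-exposing half of the pool.
        pool.sort(key=bias_score, reverse=True)
        survivors = pool[: max(1, pool_size // 2)]
        # Variation: refill the pool with mutated copies of survivors.
        children = [mutate(rng.choice(survivors), rng)
                    for _ in range(pool_size - len(survivors))]
        pool = survivors + children
    return sorted(pool, key=bias_score, reverse=True)

top_prompts = mine_biased_prompts(["a photo of a nurse", "a CEO at work"])
print(top_prompts[0])  # the prompt scored as most bias-exposing
```

In the actual method, evaluating a prompt involves generating a set of images with the TTI model, so the fitness function is far more expensive than this sketch suggests; the selection-and-mutation structure, however, follows the standard genetic-algorithm pattern.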
Similar Papers
Exposing Hidden Biases in Text-to-Image Models via Automated Prompt Search
Machine Learning (CS)
Finds hidden unfairness in AI art.
Prompting Away Stereotypes? Evaluating Bias in Text-to-Image Models for Occupations
Computation and Language
Makes AI art show different kinds of people.
AutoDebias: Automated Framework for Debiasing Text-to-Image Models
CV and Pattern Recognition
Fixes AI art to show everyone fairly.