Seeing Twice: How Side-by-Side T2I Comparison Changes Auditing Strategies
By: Matheus Kunzler Maldaner, Wesley Hanwen Deng, Jason I. Hong and more
Potential Business Impact:
Helps find bad AI pictures by comparing them.
While generative AI systems have gained popularity in diverse applications, their potential to produce harmful outputs limits their trustworthiness and utility. A small but growing line of research has explored tools and processes to better engage users without AI expertise in auditing generative AI systems. In this work, we present the design and evaluation of MIRAGE, a web-based tool exploring a "contrast-first" workflow that allows users to pick up to four different text-to-image (T2I) models, view their images side-by-side, and provide feedback on model performance on a single screen. In our user study with fifteen participants, we used four predefined models for consistency and initially showed only a single model. We found that most participants shifted from analyzing individual images to analyzing general model output patterns once the side-by-side step appeared with all four models; several participants coined persistent "model personalities" (e.g., cartoonish, saturated) that helped them form expectations about how each model would behave on future prompts. Bilingual participants also surfaced a language-fidelity gap: English prompts produced more accurate images than Portuguese or Chinese prompts, an issue often overlooked when working with a single model. These findings suggest that simple comparative interfaces can accelerate bias discovery and reshape how people think about generative models.