"The Dentist is an involved parent, the bartender is not": Revealing Implicit Biases in QA with Implicit BBQ
By: Aarushi Wagh, Saniya Srivastava
Potential Business Impact:
Finds hidden unfairness in AI language.
Existing benchmarks evaluating biases in large language models (LLMs) primarily rely on explicit cues, declaring protected attributes like religion, race, gender by name. However, real-world interactions often contain implicit biases, inferred subtly through names, cultural cues, or traits. This critical oversight creates a significant blind spot in fairness evaluation. We introduce ImplicitBBQ, a benchmark extending the Bias Benchmark for QA (BBQ) with implicitly cued protected attributes across 6 categories. Our evaluation of GPT-4o on ImplicitBBQ illustrates troubling performance disparity from explicit BBQ prompts, with accuracy declining up to 7% in the "sexual orientation" subcategory and consistent decline located across most other categories. This indicates that current LLMs contain implicit biases undetected by explicit benchmarks. ImplicitBBQ offers a crucial tool for nuanced fairness evaluation in NLP.
Similar Papers
PakBBQ: A Culturally Adapted Bias Benchmark for QA
Computation and Language
Makes AI fairer for people speaking different languages.
BharatBBQ: A Multilingual Bias Benchmark for Question Answering in the Indian Context
Computation and Language
Tests AI for unfairness in Indian languages.
PBBQ: A Persian Bias Benchmark Dataset Curated with Human-AI Collaboration for Large Language Models
Computation and Language
Helps computers understand Persian culture without bias.