
AfriStereo: A Culturally Grounded Dataset for Evaluating Stereotypical Bias in Large Language Models

Published: November 27, 2025 | arXiv ID: 2511.22016v1

By: Yann Le Beux, Oluchi Audu, Oche D. Ankeli and more

BigTech Affiliations: Stanford University

Potential Business Impact:

Helps measure AI bias against African people and cultures.

Business Areas:

Image Recognition, Data and Analytics, Software

Existing AI bias evaluation benchmarks largely reflect Western perspectives, leaving African contexts underrepresented and enabling harmful stereotypes in applications across various domains. To address this gap, we introduce AfriStereo, the first open-source African stereotype dataset and evaluation framework grounded in local socio-cultural contexts. Through community-engaged efforts across Senegal, Kenya, and Nigeria, we collected 1,163 stereotypes spanning gender, ethnicity, religion, age, and profession. Using few-shot prompting with human-in-the-loop validation, we augmented the dataset to over 5,000 stereotype-antistereotype pairs. Entries were validated through semantic clustering and manual annotation by culturally informed reviewers. Preliminary evaluation of language models reveals that nine of eleven models exhibit statistically significant bias, with Bias Preference Ratios (BPR) ranging from 0.63 to 0.78 (p <= 0.05), indicating systematic preferences for stereotypes over antistereotypes, particularly across age, profession, and gender dimensions. Domain-specific models appeared to show weaker bias in our setup, suggesting task-specific training may mitigate some associations. Looking ahead, AfriStereo opens pathways for future research on culturally grounded bias evaluation and mitigation, offering key methodologies for the AI community on building more equitable, context-aware, and globally inclusive NLP technologies.
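
The abstract reports Bias Preference Ratios with a significance threshold but does not define the computation. As a rough illustration only, the sketch below assumes BPR is the fraction of stereotype-antistereotype pairs on which a model prefers the stereotype, and tests it against the no-preference value of 0.5 with a one-sided exact binomial test. The function names, the choice of test, and the example outcomes are hypothetical and are not taken from the paper.

```python
from math import comb


def bias_preference_ratio(prefers_stereotype):
    """Fraction of pairs on which the model preferred the stereotype.

    Assumed definition: the abstract does not spell out how BPR is computed.
    """
    n = len(prefers_stereotype)
    if n == 0:
        raise ValueError("need at least one stereotype-antistereotype pair")
    return sum(prefers_stereotype) / n


def binomial_p_value(k, n, p0=0.5):
    """One-sided exact p-value P(X >= k) for X ~ Binomial(n, p0)."""
    return sum(comb(n, i) * p0**i * (1 - p0) ** (n - i) for i in range(k, n + 1))


# Hypothetical outcomes: True means the stereotype sentence was preferred.
prefs = [True] * 70 + [False] * 30

bpr = bias_preference_ratio(prefs)
p = binomial_p_value(sum(prefs), len(prefs))
print(f"BPR = {bpr:.2f}, one-sided p = {p:.3g}")
```

Under this assumed definition, a BPR of 0.5 means no systematic preference, and values such as the reported 0.63 to 0.78 mean the stereotype was preferred on roughly two thirds to three quarters of pairs.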

StereoDetect: Detecting Stereotypes and Anti-stereotypes the Correct Way Using Social Psychological Underpinnings

Computation and Language

Helps computers spot harmful stereotypes and biases.

4 Apr 2025 1

88%

Adaptive Data Collection for Latin-American Community-sourced Evaluation of Stereotypes (LACES)

Computers and Society

Finds and fixes harmful stereotypes in computer language.

28 Oct 2025 0

88%

Surfacing Subtle Stereotypes: A Multilingual, Debate-Oriented Evaluation of Modern LLMs

Computation and Language

Finds hidden bias in AI language models.

3 Nov 2025 1


Country of Origin

🇺🇸 United States

Page Count

21 pages
