
Adaptive Data Collection for Latin-American Community-sourced Evaluation of Stereotypes (LACES)

Published: October 28, 2025 | arXiv ID: 2510.24958v1

By: Guido Ivetta, Pietro Palombini, Sofía Martinelli and more

Potential Business Impact:

Finds and fixes harmful stereotypes in computer language.

Business Areas:

Natural Language Processing, Artificial Intelligence, Data and Analytics, Software

The evaluation of societal biases in NLP models is critically hindered by a glaring geo-cultural gap, as existing benchmarks are overwhelmingly English-centric and focused on U.S. demographics. This leaves regions such as Latin America severely underserved, making it impossible to adequately assess or mitigate the perpetuation of harmful regional stereotypes by language technologies. To address this gap, we introduce a new, large-scale dataset of stereotypes developed through targeted community partnerships within Latin America. Furthermore, we present a novel dynamic data collection methodology that uniquely integrates the sourcing of new stereotype entries and the validation of existing data within a single, unified workflow. This combined approach results in a resource with significantly broader coverage and higher regional nuance than static collection methods. We believe that this new method could be applicable in gathering sociocultural knowledge of other kinds, and that this dataset provides a crucial new resource enabling robust stereotype evaluation and significantly addressing the geo-cultural deficit in fairness resources for Latin America.

Advancing Equitable AI: Evaluating Cultural Expressiveness in LLMs for Latin American Contexts

Social and Information Networks

Makes AI understand Latin America better.

6 Nov 2025 2

90%

SESGO: Spanish Evaluation of Stereotypical Generative Outputs

Computers and Society

Finds bias in AI that speaks Spanish.

3 Sep 2025 1

88%

IndiCASA: A Dataset and Bias Evaluation Framework in LLMs Using Contrastive Embedding Similarity in the Indian Context

Computation and Language

Finds and fixes unfairness in AI language.

3 Oct 2025 2


Country of Origin

🇦🇷 Argentina

Page Count

13 pages
