TALES: A Taxonomy and Analysis of Cultural Representations in LLM-generated Stories
By: Kirti Bhagat, Shaily Bhatt, Athul Velagapudi, and more
Potential Business Impact:
Finds AI stories often get Indian cultures wrong.
Millions of users across the globe turn to AI chatbots for their creative needs, inviting widespread interest in understanding how such chatbots represent diverse cultures. At the same time, evaluating cultural representations in open-ended tasks remains challenging and underexplored. In this work, we present TALES, an evaluation of cultural misrepresentations in LLM-generated stories for diverse Indian cultural identities. First, we develop TALES-Tax, a taxonomy of cultural misrepresentations, by collating insights from participants with lived experiences in India through focus groups (N=9) and individual surveys (N=15). Using TALES-Tax, we evaluate 6 models through a large-scale annotation study spanning 2,925 annotations from 108 annotators with lived cultural experience, representing 71 regions of India and 14 languages. Concerningly, we find that 88% of the generated stories contain one or more cultural inaccuracies, and that such errors are more prevalent in mid- and low-resource languages and in stories set in peri-urban regions of India. Lastly, we transform the annotations into TALES-QA, a standalone question bank for evaluating the cultural knowledge of foundation models. Surprisingly, this evaluation reveals that models often possess the requisite cultural knowledge despite generating stories rife with cultural misrepresentations.
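As a rough illustration of the TALES-QA idea, the sketch below shows how an annotated misrepresentation from a story might be converted into a standalone knowledge question, so that a model's factual knowledge can be probed separately from its generation behavior. The schema, field names, and example content here are hypothetical assumptions for illustration, not the paper's actual annotation format.

```python
from dataclasses import dataclass

# Hypothetical schema: field names and example content are illustrative
# assumptions, not the paper's actual TALES annotation format.
@dataclass
class Annotation:
    story_id: str
    culture: str          # e.g. a regional/linguistic identity in India
    error_span: str       # story text flagged as a misrepresentation
    error_category: str   # an assumed TALES-Tax category label
    correction: str       # annotator-provided correct fact

@dataclass
class QAItem:
    question: str
    options: list[str]
    answer: str           # the culturally accurate option

def annotation_to_qa(ann: Annotation, distractor: str) -> QAItem:
    """Turn one annotated error into a standalone knowledge question:
    the annotator's correction becomes the gold answer, and the model's
    original (incorrect) claim serves as a built-in distractor."""
    question = f"In the context of {ann.culture}, which statement is accurate?"
    options = sorted([ann.correction, ann.error_span, distractor])
    return QAItem(question=question, options=options, answer=ann.correction)

if __name__ == "__main__":
    ann = Annotation(
        story_id="story-0042",
        culture="a Tamil household",  # illustrative identity
        error_span="the family celebrates Pongal in autumn",
        error_category="festival timing",
        correction="the family celebrates Pongal in mid-January",
    )
    qa = annotation_to_qa(ann, distractor="the family celebrates Pongal in June")
    print(qa.question)
    for opt in qa.options:
        print("-", opt)
```

Decoupling knowledge probing from story generation in this way is what makes the paper's headline contrast observable: a model can answer such questions correctly while still producing stories that contain the corresponding errors.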
Similar Papers
Biased Tales: Cultural and Topic Bias in Generating Children's Stories
Computation and Language
AI stories show unfair gender and cultural bias.
Topic-aware Large Language Models for Summarizing the Lived Healthcare Experiences Described in Health Stories
Computers and Society
Helps doctors understand patient stories better.
Invisible Filters: Cultural Bias in Hiring Evaluations Using Large Language Models
Computers and Society
AI hiring tools unfairly judge people from different countries.