Culturally-Aware Conversations: A Framework & Benchmark for LLMs
By: Shreya Havaldar, Sunny Rai, Young-Min Cho and more
Potential Business Impact:
Helps computers talk better with people everywhere.
Existing benchmarks that measure cultural adaptation in LLMs are misaligned with the actual challenges these models face when interacting with users from diverse cultural backgrounds. In this work, we introduce the first framework and benchmark designed to evaluate LLMs in realistic, multicultural conversational settings. Grounded in sociocultural theory, our framework formalizes how linguistic style - a key element of cultural communication - is shaped by situational, relational, and cultural context. We construct a benchmark dataset based on this framework, annotated by culturally diverse raters, and propose a new set of desiderata for cross-cultural evaluation in NLP: conversational framing, stylistic sensitivity, and subjective correctness. We evaluate today's top LLMs on our benchmark and show that these models struggle with cultural adaptation in a conversational setting.
Similar Papers
Bridging the Culture Gap: A Framework for LLM-Driven Socio-Cultural Localization of Math Word Problems in Low-Resource Languages
Computation and Language
Makes math problems understandable in any language.
Do Large Language Models Truly Understand Cross-cultural Differences?
Computation and Language
Tests if computers understand different cultures.
MyCulture: Exploring Malaysia's Diverse Culture under Low-Resource Language Constraints
Computation and Language
Tests computers on Malaysian culture and language.