From National Curricula to Cultural Awareness: Constructing Open-Ended Culture-Specific Question Answering Dataset
By: Haneul Yoo, Won Ik Cho, Geunhye Kim and more
Potential Business Impact:
Teaches computers Korean culture for better answers.
Large language models (LLMs) achieve strong performance on many tasks, but their progress remains uneven across languages and cultures, often reflecting values latent in English-centric training data. To enable practical cultural alignment, we propose a scalable approach that leverages national social studies curricula as a foundation for culture-aware supervision. We introduce CuCu, an automated multi-agent LLM framework that transforms national textbook curricula into open-ended, culture-specific question-answer pairs. Applying CuCu to the Korean national social studies curriculum, we construct KCaQA, comprising 34.1k open-ended QA pairs. Our quantitative and qualitative analyses suggest that KCaQA covers culture-specific topics and produces responses grounded in local sociocultural contexts.
Similar Papers
CUS-QA: Local-Knowledge-Oriented Open-Ended Question Answering Dataset
Computation and Language
Helps computers answer questions about places.
The Curious Case of Curiosity across Human Cultures and LLMs
Computation and Language
Makes computers curious like people everywhere.
CUS-QA: Local-Knowledge-Oriented Open-Ended Question Answering Dataset
Computation and Language
Helps computers answer questions using text and pictures.