Palm: A Culturally Inclusive and Linguistically Diverse Dataset for Arabic LLMs
By: Fakhraddin Alwajih, Abdellah El Mekki, Samar Mohamed Magdy, and more
Potential Business Impact:
Teaches computers to understand Arabic culture and dialects.
As large language models (LLMs) become increasingly integrated into daily life, ensuring their cultural sensitivity and inclusivity is paramount. We introduce Palm, a dataset produced by a year-long community-driven project covering all 22 Arab countries. The dataset consists of instructions (input-response pairs) in both Modern Standard Arabic (MSA) and dialectal Arabic (DA), spanning 20 diverse topics. Built by a team of 44 researchers across the Arab world, all of whom are authors of this paper, it offers a broad, inclusive perspective. We use Palm to evaluate the cultural and dialectal capabilities of several frontier LLMs, revealing notable limitations. For instance, while closed-source LLMs generally perform strongly, they are not without flaws, and smaller open-source models face greater challenges. Moreover, certain countries (e.g., Egypt, the UAE) appear better represented than others (e.g., Iraq, Mauritania, Yemen). Our annotation guidelines, code, and data are publicly available for reproducibility.
Similar Papers
PalmX 2025: The First Shared Task on Benchmarking LLMs on Arabic and Islamic Culture
Computation and Language
Teaches computers about Arab and Islamic cultures.
SaudiCulture: A Benchmark for Evaluating Large Language Models Cultural Competence within Saudi Arabia
Computation and Language
Helps computers understand Saudi culture better.
LLM Alignment for the Arabs: A Homogenous Culture or Diverse Ones?
Computation and Language
Helps AI understand all Arab cultures, not just one.