LLM Alignment for the Arabs: A Homogenous Culture or Diverse Ones?
By: Amr Keleg
Potential Business Impact:
Helps AI understand all Arab cultures, not just one.
Large language models (LLMs) have the potential to be useful tools that automate tasks and assist humans. However, these models are more fluent in English and more aligned with Western cultures, norms, and values. Arabic-specific LLMs are being developed to better capture the nuances of the Arabic language, as well as the views of Arabs. Yet Arabs are sometimes assumed to share a single culture. In this position paper, I discuss the limitations of this assumption and offer preliminary thoughts on how to build systems that better represent the cultural diversity within the Arab world. The invalidity of the cultural homogeneity assumption might seem obvious, yet the assumption is widely adopted in developing multilingual and Arabic-specific LLMs. I hope that this paper will encourage the NLP community to be considerate of the cultural diversity within communities that speak the same language.
Similar Papers
Large Language Models and Arabic Content: A Review
Computation and Language
Helps computers understand and use Arabic language better.
An Evaluation of Cultural Value Alignment in LLM
Computers and Society
Helps computers understand different cultures worldwide.
Tahakom LLM guidelines and receipts: from pre-training data to an Arabic LLM
Machine Learning (CS)
Helps computers understand and speak Arabic better.