MuDRiC: Multi-Dialect Reasoning for Arabic Commonsense Validation
By: Kareem Elozeiri, Mervat Abassy, Preslav Nakov and more
Potential Business Impact:
Helps computers judge whether Arabic sentences, including those in regional dialects, make everyday sense.
Commonsense validation evaluates whether a sentence aligns with everyday human understanding, a critical capability for developing robust natural language understanding systems. While substantial progress has been made in English, the task remains underexplored in Arabic, particularly given the language's rich linguistic diversity. Existing Arabic resources have primarily focused on Modern Standard Arabic (MSA), leaving regional dialects underrepresented despite their prevalence in spoken contexts. To bridge this gap, we present two key contributions: (i) MuDRiC, an extended Arabic commonsense dataset incorporating multiple dialects, and (ii) a novel method that adapts Graph Convolutional Networks (GCNs) to Arabic commonsense reasoning, enhancing semantic relationship modeling for improved commonsense validation. Our experimental results demonstrate that this approach achieves superior performance in Arabic commonsense validation. Our work advances Arabic natural language understanding by providing both a foundational dataset and a novel method for handling the language's complex variations. To the best of our knowledge, MuDRiC is the first Arabic multi-dialect commonsense reasoning dataset.
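For readers unfamiliar with GCNs, the sketch below shows one standard graph-convolution step in the Kipf and Welling formulation, ReLU(D^-1/2 (A + I) D^-1/2 H W), in plain Python. It is only an illustration of the general technique. The abstract does not say how the authors build their graphs or configure their model, so the function name `gcn_layer`, the two-node toy graph, and the identity weight matrix are all assumptions made for this example.

```python
import math


def gcn_layer(adj, feats, weight):
    """One generic GCN step: ReLU(D^-1/2 (A + I) D^-1/2 * H * W).

    adj    -- n x n adjacency matrix (list of lists); hypothetical input
    feats  -- n x f node feature matrix H
    weight -- f x o weight matrix W (learned in a real model)
    """
    n = len(adj)
    # Add self-loops so each node also keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    # Node degrees in the self-looped graph.
    deg = [sum(row) for row in a_hat]
    # Symmetric normalization: entry (i, j) is divided by sqrt(deg_i * deg_j).
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]
    # Aggregate neighbour features: norm @ feats.
    n_feats = len(feats[0])
    agg = [[sum(norm[i][k] * feats[k][f] for k in range(n))
            for f in range(n_feats)]
           for i in range(n)]
    # Linear transform followed by ReLU: max(0, agg @ weight).
    n_out = len(weight[0])
    return [[max(0.0, sum(agg[i][f] * weight[f][o] for f in range(n_feats)))
             for o in range(n_out)]
            for i in range(n)]


# Toy example (assumed, not from the paper): two connected nodes,
# one-hot features, identity weights.
toy_adj = [[0.0, 1.0], [1.0, 0.0]]
toy_feats = [[1.0, 0.0], [0.0, 1.0]]
toy_weight = [[1.0, 0.0], [0.0, 1.0]]
toy_out = gcn_layer(toy_adj, toy_feats, toy_weight)
```

In this toy case each node ends up with an even mix of its own features and its neighbour's, which is the kind of relationship-aware smoothing that a GCN layer applies before any classification step.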
Similar Papers
Commonsense Reasoning in Arab Culture
Computation and Language
Teaches computers Arabic culture for better understanding.
Cross-Cultural Transfer of Commonsense Reasoning in LLMs: Evidence from the Arab World
Artificial Intelligence
Teaches computers about different cultures easily.
DialectalArabicMMLU: Benchmarking Dialectal Capabilities in Arabic and Multilingual Language Models
Computation and Language
Tests if computers understand different Arabic dialects.