Open Automatic Speech Recognition Models for Classical and Modern Standard Arabic
By: Lilit Grigoryan, Nikolay Karpov, Enas Albasiri and more
Potential Business Impact:
Helps computers understand spoken Arabic better.
Although Arabic is one of the most widely spoken languages, the development of Arabic Automatic Speech Recognition (ASR) systems faces significant challenges due to the language's complexity, and only a limited number of public Arabic ASR models exist. Most work has focused on Modern Standard Arabic (MSA), and considerably less attention has been given to the variations within the language. This paper introduces a universal methodology for Arabic speech and text processing designed to address the unique challenges of the language. Using this methodology, we train two novel models based on the FastConformer architecture: one designed specifically for MSA, and the other the first unified public model for both MSA and Classical Arabic (CA). The MSA model sets a new benchmark with state-of-the-art (SOTA) performance on related datasets. The unified model achieves SOTA accuracy with diacritics for CA while maintaining strong performance for MSA. To promote reproducibility, we open-source the models and their training recipes.
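The abstract distinguishes between accuracy with diacritics (relevant for CA) and plain MSA transcription. As a rough illustration of why that distinction matters for text processing, the sketch below strips the common Arabic diacritic marks from a string so that diacritized and undiacritized spellings of a word compare equal. It is not the authors' pipeline: the function name `strip_diacritics` and the chosen Unicode range (U+064B to U+0652, plus the superscript alef U+0670) are assumptions made for this example.

```python
import re

# Assumption for this sketch: treat the harakat/tanween/shadda/sukun block
# (U+064B..U+0652) and the superscript alef (U+0670) as "diacritics".
_DIACRITICS = re.compile("[\u064B-\u0652\u0670]")


def strip_diacritics(text: str) -> str:
    """Return `text` with the Arabic diacritic marks listed above removed."""
    return _DIACRITICS.sub("", text)


# The word "kataba" written with three fatha marks, and without any marks.
diacritized = "\u0643\u064E\u062A\u064E\u0628\u064E"
plain = "\u0643\u062A\u0628"

# The raw strings differ, but they match once diacritics are removed.
print(diacritized == plain)
print(strip_diacritics(diacritized) == plain)
```

Scoring a model "with diacritics" would compare the raw strings, so a missing or wrong mark counts as an error; scoring without diacritics would normalize both reference and hypothesis first, for example with a function like the one above.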
Similar Papers
Advancing Arabic Speech Recognition Through Large-Scale Weakly Supervised Learning
Artificial Intelligence
Lets computers understand Arabic speech without human help.
Arabic ASR on the SADA Large-Scale Arabic Speech Corpus with Transformer-Based Models
Audio and Speech Processing
Helps computers understand different Arabic accents better.
Munsit at NADI 2025 Shared Task 2: Pushing the Boundaries of Multidialectal Arabic ASR with Weakly Supervised Pretraining and Continual Supervised Fine-tuning
Computation and Language
Helps computers understand many Arabic accents.