From Scarcity to Efficiency: Investigating the Effects of Data Augmentation on African Machine Translation
By: Mardiyyah Oduwole, Oluwatosin Olajide, Jamiu Suleiman and more
Potential Business Impact:
Improves translation for African languages.
The linguistic diversity across the African continent presents distinct challenges and opportunities for machine translation. This study explores the effects of data augmentation techniques on translation systems for low-resource African languages. We focus on two data augmentation techniques, sentence concatenation with back translation and switch-out, and apply them across six African languages. Our experiments show significant improvements in machine translation performance, with a minimum increase of 25% in BLEU score across all six languages. We provide a comprehensive analysis and highlight the potential of these techniques to support more robust translation systems for under-resourced languages.
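To make the two techniques named in the abstract more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the function names `concat_pairs` and `switch_out`, the fixed replacement `rate`, and the toy data are all hypothetical. The back-translation step is omitted because it needs a trained reverse translation model, and the fixed-rate replacement is a simplification (the published SwitchOut method samples how many words to replace rather than using a fixed rate).

```python
import random


def concat_pairs(src, tgt, rng, sep=" "):
    """Create new parallel examples by joining each sentence pair
    with another pair drawn at random from the same corpus."""
    assert len(src) == len(tgt), "source and target must be aligned"
    augmented = []
    for i in range(len(src)):
        j = rng.randrange(len(src))  # partner pair (may equal i)
        augmented.append((src[i] + sep + src[j], tgt[i] + sep + tgt[j]))
    return augmented


def switch_out(tokens, vocab, rate, rng):
    """Simplified switch-out: replace each token with a random
    vocabulary word with probability `rate` (an assumed parameter)."""
    return [rng.choice(vocab) if rng.random() < rate else tok for tok in tokens]


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy, hypothetical data used only to exercise the functions.
    src = ["good morning", "thank you", "how are you"]
    tgt = ["target sentence one", "target sentence two", "target sentence three"]
    for s, t in concat_pairs(src, tgt, rng):
        print(s, "|||", t)
    print(switch_out("good morning my friend".split(), ["thank", "you", "how"], 0.3, rng))
```

In this sketch the concatenated examples keep source and target aligned because both sides are joined in the same order, and switch-out leaves sentence length unchanged because it substitutes tokens rather than inserting or deleting them.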
Similar Papers
Data Augmentation and Hyperparameter Tuning for Low-Resource MFA
Computation and Language
Improves computer understanding of rare languages.
Data Augmentation With Back translation for Low Resource languages: A case of English and Luganda
Computation and Language
Improves computer translation for rare languages.
A fully automated and scalable Parallel Data Augmentation for Low Resource Languages using Image and Text Analytics
Computation and Language
Helps computers understand many languages better.