Scaling HuBERT for African Languages: From Base to Large and XL

Published: November 28, 2025 | arXiv ID: 2511.23370v1

By: Antoine Caubrière, Elodie Gauthier

Potential Business Impact:

Improves automatic speech recognition and language identification for many under-represented African languages.

Business Areas:
Big Data, Data and Analytics

Despite recent progress in multilingual speech processing, African languages remain under-represented in both research and deployed systems, particularly when it comes to strong, open-weight encoders that transfer well under low-resource supervision. Self-supervised learning has proven especially promising in such settings, yet most publicly released models targeting African speech remain at BASE scale, leaving unanswered whether larger encoders, trained exclusively on Africa-centric audio, offer tangible benefits and how model capacity interacts with data composition. This work addresses that gap by introducing SSA-HuBERT-Large (317M parameters) and SSA-HuBERT-XL (964M parameters), the first large models trained solely on African speech, alongside a BASE-size counterpart. We release these models as open weights: see https://huggingface.co/collections/Orange/african-speech-foundation-models. By conducting a carefully controlled experimental study focused exclusively on Sub-Saharan languages, covering automatic speech recognition (ASR) and language identification (LID) tasks, we demonstrate that larger architectures significantly improve performance by effectively leveraging large audio datasets.
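Since the released checkpoints are HuBERT encoders, they should be loadable with the standard Hugging Face transformers APIs. Below is a minimal sketch of extracting frame-level speech representations for a downstream ASR or LID head; the repo id is a placeholder (an assumption, not confirmed by the paper), so substitute the actual model id from the collection linked above.

```python
# Minimal sketch: extract speech representations from one of the released
# SSA-HuBERT encoders. The repo id below is hypothetical -- use the real
# model id from the Hugging Face collection linked in the abstract.
import torch
from transformers import AutoFeatureExtractor, AutoModel

model_id = "Orange/SSA-HuBERT-Large"  # placeholder repo id

feature_extractor = AutoFeatureExtractor.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)
model.eval()

# One second of dummy 16 kHz audio; replace with real speech samples.
waveform = torch.zeros(16000)

inputs = feature_extractor(
    waveform.numpy(),
    sampling_rate=16000,
    return_tensors="pt",
)

with torch.no_grad():
    outputs = model(**inputs)

# Frame-level hidden states, usable as features for ASR or LID fine-tuning.
print(outputs.last_hidden_state.shape)  # (batch, frames, hidden_size)
```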

Repos / Data Links
https://huggingface.co/collections/Orange/african-speech-foundation-models

Page Count
2 pages

Category
Computer Science:
Computation and Language