Open ASR Leaderboard: Towards Reproducible and Transparent Multilingual and Long-Form Speech Recognition Evaluation
By: Vaibhav Srivastav, Steven Zheng, Eric Bezzam, and more
Potential Business Impact:
Helps computers turn spoken words into text faster and more accurately.
Despite rapid progress, automatic speech recognition (ASR) evaluation remains saturated with short-form English, and efficiency is rarely reported. We present the Open ASR Leaderboard, a fully reproducible benchmark and interactive leaderboard comparing 60+ open-source and proprietary systems across 11 datasets, including dedicated multilingual and long-form tracks. We standardize text normalization and report both word error rate (WER) and inverse real-time factor (RTFx), enabling fair accuracy-efficiency comparisons. For English transcription, Conformer encoders paired with LLM decoders achieve the best average WER but are slower, while CTC and TDT decoders deliver much better RTFx, making them attractive for long-form and offline use. Whisper-derived encoders fine-tuned for English improve accuracy but often trade off multilingual coverage. All code and dataset loaders are open-sourced to support transparent, extensible evaluation.
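To make the two headline metrics concrete, here is a minimal sketch of how a WER/RTFx evaluation fits together. The `jiwer` package supplies the WER computation; `normalize` is a deliberately simplified stand-in for the leaderboard's standardized text normalization, and `asr_transcribe`, `audio_paths`, and `audio_seconds` are hypothetical placeholders for a real system and test set, not part of the leaderboard's actual API.

```python
import re
import time

import jiwer  # pip install jiwer

def normalize(text: str) -> str:
    """Simplified stand-in for the leaderboard's standardized normalizer:
    lowercase, strip punctuation, collapse whitespace."""
    text = text.lower()
    text = re.sub(r"[^\w\s']", " ", text)
    return " ".join(text.split())

def evaluate(asr_transcribe, audio_paths, audio_seconds, references):
    """Return (WER, RTFx) for one system on one test set.

    asr_transcribe: hypothetical callable mapping an audio path to a transcript.
    audio_seconds:  per-file audio durations, used for the RTFx numerator.
    """
    start = time.perf_counter()
    hypotheses = [asr_transcribe(path) for path in audio_paths]
    elapsed = time.perf_counter() - start

    # WER is computed on normalized text so systems are compared fairly.
    wer = jiwer.wer(
        [normalize(ref) for ref in references],
        [normalize(hyp) for hyp in hypotheses],
    )
    # RTFx = audio duration / wall-clock time; higher is faster.
    rtfx = sum(audio_seconds) / elapsed
    return wer, rtfx
```

An RTFx of 10 means the system transcribes ten seconds of audio per second of wall-clock time, which is why the much higher RTFx of CTC and TDT decoders makes them attractive for long-form and offline workloads.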
Similar Papers
The ML-SUPERB 2.0 Challenge: Towards Inclusive ASR Benchmarking for All Language Varieties
Computation and Language
Helps computers understand many languages and accents.
Transsion Multilingual Speech Recognition System for MLC-SLM 2025 Challenge
Audio and Speech Processing
Lets computers understand speech in many languages.