The AudioMOS Challenge 2025

Published: September 1, 2025 | arXiv ID: 2509.01336v1

By: Wen-Chin Huang, Hui Wang, Cheng Liu and more

Potential Business Impact:

Lets computers automatically predict how people would rate the quality of synthetic audio.

Business Areas:

Audio Media and Entertainment, Music and Audio

This is the summary paper for the AudioMOS Challenge 2025, the first challenge on automatic subjective quality prediction for synthetic audio. The challenge consists of three tracks. The first track assesses text-to-music samples in terms of overall quality and textual alignment. The second track is based on the four evaluation dimensions of Meta Audiobox Aesthetics, and its test set consists of text-to-speech, text-to-audio, and text-to-music samples. The third track focuses on synthetic speech quality assessment at different sampling rates. The challenge attracted 24 unique teams from both academia and industry, and improvements over the baselines were confirmed. The outcome of this challenge is expected to facilitate development and progress in the field of automatic evaluation for audio generation systems.