CodecBench: A Comprehensive Benchmark for Acoustic and Semantic Evaluation

Published: August 28, 2025 | arXiv ID: 2508.20660v1

By: Ruifan Deng, Yitian Gong, Qinghui Gao and more

Potential Business Impact:

Benchmarks how well audio codecs preserve both the acoustic detail and the meaning of sound and speech.

Business Areas:
Semantic Web, Internet Services

With the rise of multimodal large language models (LLMs), audio codecs play an increasingly vital role in encoding audio into discrete tokens, enabling the integration of audio into text-based LLMs. Current audio codecs capture two types of information: acoustic and semantic. As audio codecs are applied to diverse scenarios in speech language models, they need to model increasingly complex information and adapt to varied contexts, such as scenarios with multiple speakers, background noise, or richer paralinguistic information. However, codec evaluations have so far relied on simplistic metrics and scenarios, and existing audio codec benchmarks are not designed for complex application scenarios, which limits the assessment of acoustic and semantic capabilities on complex datasets. We introduce CodecBench, a comprehensive evaluation dataset for assessing audio codec performance from both acoustic and semantic perspectives across four data domains. Through this benchmark, we aim to identify current limitations, highlight future research directions, and foster advances in the development of audio codecs. The code is available at https://github.com/RayYuki/CodecBench.

Country of Origin
🇨🇳 China

Page Count
16 pages

Category
Electrical Engineering and Systems Science:
Audio and Speech Processing