PURE Codec: Progressive Unfolding of Residual Entropy for Speech Codec Learning

Published: November 27, 2025 | arXiv ID: 2511.22687v1

By: Jiatong Shi, Haoran Wang, William Chen and more

Potential Business Impact:

Makes phone calls sound clearer, even with background noise.

Business Areas:
Quantum Computing Science and Engineering

Neural speech codecs have achieved strong performance in low-bitrate compression, but residual vector quantization (RVQ) often suffers from unstable training and ineffective decomposition, limiting reconstruction quality and efficiency. We propose PURE Codec (Progressive Unfolding of Residual Entropy), a novel framework that guides multi-stage quantization using a pre-trained speech enhancement model. The first quantization stage reconstructs low-entropy, denoised speech embeddings, while subsequent stages encode residual high-entropy components. This design improves training stability significantly. Experiments demonstrate that PURE consistently outperforms conventional RVQ-based codecs in reconstruction and downstream speech language model-based text-to-speech, particularly under noisy training conditions.
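The staged decomposition described in the abstract can be illustrated with a toy residual vector quantizer. This is only a minimal sketch under stated assumptions, not the authors' implementation: the helper names (`nearest_code`, `rvq_encode`, `guided_losses`), the fixed codebooks, and the use of a mean-squared-error target tying the first stage to an "enhanced" (denoised) embedding are all hypothetical choices made for illustration. The abstract does not specify the actual losses, architecture, or enhancement model.

```python
from typing import List, Sequence, Tuple

Vector = List[float]


def nearest_code(x: Sequence[float],
                 codebook: Sequence[Sequence[float]]) -> Tuple[int, Vector]:
    """Return (index, entry) of the codebook vector closest to x in squared L2."""
    best = min(
        range(len(codebook)),
        key=lambda k: sum((a - b) ** 2 for a, b in zip(x, codebook[k])),
    )
    return best, list(codebook[best])


def rvq_encode(x: Sequence[float],
               codebooks: Sequence[Sequence[Sequence[float]]]) -> Tuple[List[int], Vector]:
    """Plain RVQ: each stage quantizes what the previous stages left over."""
    residual = list(x)
    recon = [0.0] * len(x)
    indices = []
    for codebook in codebooks:
        idx, q = nearest_code(residual, codebook)
        indices.append(idx)
        recon = [r + v for r, v in zip(recon, q)]
        residual = [r - v for r, v in zip(residual, q)]
    return indices, recon


def mse(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean squared error between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)


def guided_losses(x: Sequence[float],
                  enhanced: Sequence[float],
                  codebooks: Sequence[Sequence[Sequence[float]]]) -> Tuple[float, float]:
    """Assumed guidance scheme (hypothetical): the first stage is compared with
    the enhanced embedding, the sum of all stages with the original embedding."""
    _, first_stage = nearest_code(x, codebooks[0])
    _, recon = rvq_encode(x, codebooks)
    return mse(first_stage, enhanced), mse(recon, x)


# Toy data: a "noisy" embedding, its "denoised" counterpart, two tiny codebooks.
noisy = [1.0, 0.25]
enhanced = [1.0, 0.0]
codebooks = [
    [[1.0, 0.0], [0.0, 1.0]],     # stage 1: coarse, clean-speech-like content
    [[0.0, 0.25], [0.0, -0.25]],  # stage 2: leftover residual detail
]
indices, recon = rvq_encode(noisy, codebooks)
first_loss, full_loss = guided_losses(noisy, enhanced, codebooks)
```

In this toy setup the first stage lands on the enhanced vector and the second stage absorbs the remaining residual, which mirrors the low-entropy-first, high-entropy-later split that the abstract describes. A real codec would learn the codebooks jointly with an encoder and decoder.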

Page Count
8 pages

Category
Computer Science:
Sound