Unsupervised Speech Enhancement using Data-defined Priors
By: Dominik Klement, Matthew Maciejewski, Sanjeev Khudanpur, and others
Potential Business Impact:
Cleans up noisy voices without needing perfect examples.
The majority of deep learning-based speech enhancement methods require paired clean-noisy speech data. Collecting such data at scale in real-world conditions is infeasible, which has led the community to rely on synthetically generated noisy speech. However, this introduces a gap between the training and testing phases. In this work, we propose a novel dual-branch encoder-decoder architecture for unsupervised speech enhancement that separates the input into clean speech and residual noise. Adversarial training is employed to impose priors on each branch, defined by unpaired datasets of clean speech and, optionally, noise. Experimental results show that our method achieves performance comparable to leading unsupervised speech enhancement approaches. Furthermore, we demonstrate the critical impact of clean speech data selection on enhancement performance. In particular, our findings reveal that performance may appear overly optimistic when in-domain clean speech data are used for prior definition -- a practice adopted in previous unsupervised speech enhancement studies.
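The decomposition described in the abstract — a network splits the noisy input into a clean-speech estimate and a residual-noise estimate, and unpaired data define an adversarial prior on each branch — can be sketched in a toy form. Everything below is an illustrative assumption, not the paper's implementation: the linear "branches" stand in for learned encoder-decoder networks, linear discriminators stand in for learned critics, and a least-squares GAN objective is just one plausible choice of adversarial loss.

```python
import numpy as np

rng = np.random.default_rng(0)

def dual_branch(x, W_s, W_n):
    """Toy dual-branch 'decoder': two linear maps producing a
    clean-speech estimate and a residual-noise estimate."""
    return W_s @ x, W_n @ x

def lsgan_d_loss(d, real, fake):
    """Least-squares discriminator loss (illustrative choice):
    push d(real) toward 1 and d(fake) toward 0."""
    return np.mean((d(real) - 1.0) ** 2) + np.mean(d(fake) ** 2)

def enhancement_loss(x, s_hat, n_hat, d_speech, d_noise):
    """Generator-side objective: the two branches must jointly
    reconstruct the mixture, while each branch output must fool its
    discriminator, i.e. look like a sample from the unpaired prior."""
    recon = np.mean((x - (s_hat + n_hat)) ** 2)    # s_hat + n_hat ~ x
    adv_s = np.mean((d_speech(s_hat) - 1.0) ** 2)  # clean-speech prior
    adv_n = np.mean((d_noise(n_hat) - 1.0) ** 2)   # optional noise prior
    return recon + adv_s + adv_n

# Tiny demo on random "features" (stand-ins for spectrogram frames).
dim = 8
x = rng.normal(size=dim)
W_s, W_n = rng.normal(size=(dim, dim)), rng.normal(size=(dim, dim))
s_hat, n_hat = dual_branch(x, W_s, W_n)

# Linear discriminators as placeholders for learned networks.
w_ds, w_dn = rng.normal(size=dim), rng.normal(size=dim)
d_speech = lambda z: w_ds @ z
d_noise = lambda z: w_dn @ z

# 'Real' sample drawn from the unpaired clean-speech dataset (here random).
clean_sample = rng.normal(size=dim)
d_loss = lsgan_d_loss(d_speech, clean_sample, s_hat)
g_loss = enhancement_loss(x, s_hat, n_hat, d_speech, d_noise)
```

In a real system the generator and discriminator losses would be minimized alternately by gradient descent; the key point the sketch captures is that no paired clean-noisy examples appear anywhere — only the mixture `x` and unpaired prior samples such as `clean_sample`.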
Similar Papers
Diffusion-Based Unsupervised Audio-Visual Speech Separation in Noisy Environments with Noise Prior
Audio and Speech Processing
Cleans up noisy audio to hear voices better.
Unsupervised Single-Channel Audio Separation with Diffusion Source Priors
Audio and Speech Processing
Separates music into individual instruments without needing original recordings.
Efficient Speech Enhancement via Embeddings from Pre-trained Generative Audioencoders
Audio and Speech Processing
Cleans up messy sounds to make voices clear.