Knowledge Distillation for Speech Denoising by Latent Representation Alignment with Cosine Distance

Published: May 6, 2025 | arXiv ID: 2505.03442v1

By: Diep Luong, Mikko Heikkinen, Konstantinos Drossos and more

Potential Business Impact:

Makes noisy sounds clear for small devices.

Business Areas:

Speech Recognition, Data and Analytics, Software

Speech denoising is a widely adopted and impactful task that appears in many everyday use cases. Although very powerful methods have been published, most are too complex to deploy in everyday, low-resource computational environments such as hand-held devices, intelligent glasses, and hearing aids. Knowledge distillation (KD) is a prominent way to alleviate this complexity mismatch: it transfers (distills) knowledge from a pre-trained complex model, the teacher, to a less complex one, the student. Existing KD methods for speech denoising rely on processes that can hamper distillation by binding the student's learning to the distribution, information ordering, and feature dimensionality learned by the teacher. In this paper, we present and assess a method that addresses this issue by exploiting the well-known denoising-autoencoder framework, linear inverted bottlenecks, and the properties of cosine similarity. We use a public dataset and conduct repeated experiments with different mismatch scenarios between the teacher and the student, reporting the mean and standard deviation of the metrics for our method and for a state-of-the-art method used as a baseline. Our results show that with the proposed method, the student can perform better and can also handle greater mismatch relative to the teacher.
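The abstract leans on "the properties of the cosine similarity" without spelling them out. One property that is likely relevant is scale invariance: cosine distance compares only the direction of two latent vectors, so a student latent that points the same way as the teacher's but has a different magnitude incurs no penalty, whereas a mean-squared-error alignment would penalize it. The sketch below illustrates only that general property. It is not the paper's loss function; the names `cosine_distance`, `mean_squared_error`, `teacher_latent`, and `student_latent` are hypothetical, and the vectors are made-up toy values.

```python
import math


def cosine_distance(a, b, eps=1e-12):
    """Return 1 - cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # eps guards against division by zero for all-zero vectors.
    return 1.0 - dot / (norm_a * norm_b + eps)


def mean_squared_error(a, b):
    """Return the element-wise mean squared error, for comparison."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Toy latent vectors (illustrative values, not taken from the paper).
teacher_latent = [0.5, -1.0, 2.0, 0.25]
student_latent = [3.0 * v for v in teacher_latent]   # same direction, larger scale
opposite_latent = [-v for v in teacher_latent]       # opposite direction

scaled_cos = cosine_distance(student_latent, teacher_latent)
scaled_mse = mean_squared_error(student_latent, teacher_latent)
opposite_cos = cosine_distance(opposite_latent, teacher_latent)

print(f"cosine distance, rescaled student: {scaled_cos:.6f}")
print(f"MSE, rescaled student:             {scaled_mse:.6f}")
print(f"cosine distance, opposite vector:  {opposite_cos:.6f}")
```

In this toy case the rescaled student has a cosine distance of essentially zero while its MSE is clearly non-zero, and the opposite-direction vector reaches the maximum cosine distance of 2. How the paper combines this with the denoising-autoencoder framework and linear inverted bottlenecks is not described in the abstract and is not modeled here.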

Feature Alignment and Representation Transfer in Knowledge Distillation for Large Language Models

Computation and Language

Makes smart computer programs smaller and faster.

18 Apr 2025 1

90%

Delta Knowledge Distillation for Large Language Models

Computation and Language

Makes small AI learn better from big AI.

18 Sep 2025 1

90%

Do Students Debias Like Teachers? On the Distillability of Bias Mitigation Methods

Machine Learning (CS)

Makes AI less biased by teaching it better.

30 Oct 2025 0

Page Count

8 pages
