Are Deep Speech Denoising Models Robust to Adversarial Noise?
By: Will Schwarzer, Philip S. Thomas, Andrea Fanelli, and more
Potential Business Impact:
Makes voice helpers hear wrong words on purpose.
Deep noise suppression (DNS) models enjoy widespread use throughout a variety of high-stakes speech applications. However, in this paper, we show that four recent DNS models can each be reduced to outputting unintelligible gibberish through the addition of imperceptible adversarial noise. Furthermore, our results show the near-term plausibility of targeted attacks, which could induce models to output arbitrary utterances, as well as of over-the-air attacks. While the success of these attacks varies by model and setting, and attacks appear to be strongest when model-specific (i.e., white-box and non-transferable), our results highlight a pressing need for practical countermeasures in DNS systems.
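For readers unfamiliar with how such attacks are typically constructed, the sketch below illustrates a white-box, untargeted adversarial perturbation against a denoising model, in the spirit of the attacks described in the abstract. The model `dns_model`, the waveform tensor `clean_audio`, and the PGD-style optimization loop are illustrative assumptions, not the paper's exact procedure, which is not detailed in this summary.

```python
# Minimal sketch (assumptions noted above): a PGD-style, white-box,
# untargeted attack that searches for a small perturbation making a
# DNS model's enhanced output diverge from its clean-input output.
import torch
import torch.nn.functional as F

def untargeted_pgd_attack(dns_model, clean_audio, eps=1e-3, alpha=1e-4, steps=100):
    """Return adversarial audio with ||delta||_inf <= eps that maximally
    distorts the denoiser's output relative to its output on clean audio."""
    dns_model.eval()
    with torch.no_grad():
        clean_output = dns_model(clean_audio)  # reference enhanced speech

    delta = torch.zeros_like(clean_audio, requires_grad=True)
    for _ in range(steps):
        adv_output = dns_model(clean_audio + delta)
        # Negative MSE: descending on this loss *maximizes* the distance
        # between adversarial and clean outputs, pushing the model toward
        # unintelligible output.
        loss = -F.mse_loss(adv_output, clean_output)
        loss.backward()
        with torch.no_grad():
            delta -= alpha * delta.grad.sign()  # signed-gradient step
            delta.clamp_(-eps, eps)             # keep perturbation imperceptible
        delta.grad.zero_()
    return (clean_audio + delta).detach()
```

A targeted variant would instead minimize the distance between the model's output and a chosen target utterance, which corresponds to the targeted attacks whose near-term plausibility the abstract discusses.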
Similar Papers
Are Modern Speech Enhancement Systems Vulnerable to Adversarial Attacks?
Audio and Speech Processing
Makes voices say different things with hidden sounds.
Whisper Smarter, not Harder: Adversarial Attack on Partial Suppression
Sound
Makes voice assistants safer from sneaky tricks.
A Comparative Evaluation of Deep Learning Models for Speech Enhancement in Real-World Noisy Environments
Sound
Cleans up noisy voices for clearer talking.