Private kNN-VC: Interpretable Anonymization of Converted Speech
By: Carlos Franzreb, Arnab Das, Tim Polzehl and more
Potential Business Impact:
Makes voices harder to recognize while keeping speech clear.
Speaker anonymization seeks to conceal a speaker's identity while preserving the utility of their speech. The achieved privacy is commonly evaluated with a speaker recognition model trained on anonymized speech. Although this represents a strong attack, it is unclear which aspects of speech are exploited to identify the speakers. Our research sets out to unveil these aspects. It starts with kNN-VC, a powerful voice conversion model that performs poorly as an anonymization system, presumably because of prosody leakage. To test this hypothesis, we extend kNN-VC with two interpretable components that anonymize the duration and variation of phones. These components increase privacy significantly, proving that the studied prosodic factors encode speaker identity and are exploited by the privacy attack. Additionally, we show that changes in the target selection algorithm considerably influence the outcome of the privacy attack.
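As background, the matching step that gives kNN-VC its name can be pictured as nearest-neighbor regression over frame-level speech features: each source frame is replaced by the average of its k most similar frames from the target speaker. The sketch below is an illustration under stated assumptions, not the authors' implementation. Toy 2-dimensional vectors stand in for real self-supervised features, and the names `cosine_similarity` and `knn_convert` and the choice of k are hypothetical. It does not cover the duration and variation components, which the abstract does not describe in detail.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two non-zero feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def knn_convert(source_frames, target_pool, k=4):
    """Replace each source frame with the mean of its k nearest target frames.

    Assumption: frames are plain lists of floats of equal length, and
    similarity is measured with cosine similarity.
    """
    converted = []
    for frame in source_frames:
        # Rank all target frames from most to least similar to this frame.
        ranked = sorted(
            target_pool,
            key=lambda t: cosine_similarity(frame, t),
            reverse=True,
        )
        neighbours = ranked[:k]
        # Average the selected neighbours dimension by dimension.
        converted.append(
            [sum(n[d] for n in neighbours) / len(neighbours)
             for d in range(len(frame))]
        )
    return converted


# Toy data: two source frames and a pool of four target-speaker frames.
source = [[1.0, 0.0], [0.0, 1.0]]
pool = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
converted = knn_convert(source, pool, k=2)
print(converted)
```

Because every output frame is built only from target-speaker frames, the converted features carry the target's voice characteristics, while the frame-by-frame timing of the source is left untouched. That untouched timing is one way to picture the prosody leakage the abstract hypothesizes.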
Similar Papers
Inference Attacks for X-Vector Speaker Anonymization
Cryptography and Security
Keeps your voice private from sneaky listeners.
GenVC: Self-Supervised Zero-Shot Voice Conversion
Audio and Speech Processing
Changes your voice to sound like anyone else.
Speaker Anonymisation for Speech-based Suicide Risk Detection
Audio and Speech Processing
Protects voices while finding people at risk.