SecureSpeech: Prompt-based Speaker and Content Protection
By: Belinda Soh Hui Hui, Xiaoxiao Miao, Xin Wang
Potential Business Impact:
Makes voices private, hiding who spoke and what was said.
Given growing privacy concerns in speech processing, from identity theft to the re-identification of speakers through what they say, this paper proposes a prompt-based speech generation pipeline that anonymizes both speaker identity and spoken content. It does so by 1) generating a speaker identity that is unlinkable to the source speaker and controlled by descriptors, and 2) replacing sensitive content in the original text using a named entity recognition model and a large language model. The pipeline then uses the anonymized speaker identity and text to generate high-fidelity, privacy-friendly speech via a text-to-speech synthesis model. Experimental results demonstrate significant privacy protection while maintaining a decent level of content retention and audio quality. The paper also investigates how varying the speaker description affects the utility and privacy of the generated speech, in order to identify potential biases.
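The two-step flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the entity lookup table stands in for a named entity recognition model, the fixed surrogate table stands in for the large language model, and all names (`TOY_ENTITIES`, `SURROGATES`, `find_entities`, `anonymize_text`, `build_request`, `AnonymizedRequest`) are hypothetical. The text-to-speech step is omitted; the sketch only prepares the anonymized text and the speaker description that such a model would receive.

```python
import re
from dataclasses import dataclass

# Assumption: a toy lookup standing in for a named entity recognition model.
TOY_ENTITIES = {"Alice Tan": "PERSON", "Singapore": "LOCATION"}

# Assumption: fixed surrogates standing in for LLM-generated replacements.
SURROGATES = {"PERSON": "Jordan Lee", "LOCATION": "Lisbon"}


@dataclass
class AnonymizedRequest:
    """Inputs a prompt-based TTS model would receive (hypothetical container)."""
    text: str
    speaker_description: str


def find_entities(text):
    """Return sorted (start, end, label) tuples for each known sensitive span."""
    spans = []
    for surface, label in TOY_ENTITIES.items():
        for match in re.finditer(re.escape(surface), text):
            spans.append((match.start(), match.end(), label))
    return sorted(spans)


def anonymize_text(text):
    """Replace each detected sensitive span with a surrogate of the same type."""
    pieces, cursor = [], 0
    for start, end, label in find_entities(text):
        pieces.append(text[cursor:start])   # keep non-sensitive text as-is
        pieces.append(SURROGATES[label])    # swap in the surrogate
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)


def build_request(transcript, speaker_description):
    """Pair anonymized text with a descriptor-based speaker prompt."""
    return AnonymizedRequest(anonymize_text(transcript), speaker_description)
```

Calling `build_request("Alice Tan flew to Singapore.", "a middle-aged female voice with a low pitch")` yields a request whose text no longer contains the original name or location, and whose speaker description is independent of the source speaker's voice.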
Similar Papers
Speaker Anonymisation for Speech-based Suicide Risk Detection
Audio and Speech Processing
Protects voices while finding people at risk.
SafeSpeech: Robust and Universal Voice Protection Against Malicious Speech Synthesis
Sound
Stops fake voices from being made from your speech.
FreeTalk: A plug-and-play and black-box defense against speech synthesis attacks
Cryptography and Security
Protects your voice from being copied by others.