Differentially Private Speaker Anonymization
Ali Shahin Shamsabadi, Brij Mohan Lal Srivastava, Aurélien Bellet, Nathalie Vauquier, Emmanuel Vincent, Mohamed Maouche, Marc Tommasi, Nicolas Papernot

TL;DR
This paper introduces a novel speaker anonymization method that employs differentially private feature extractors to effectively remove speaker identity from speech, providing provable privacy guarantees while maintaining high utility for speech recognition tasks.
Contribution
It presents the first integration of differentially private feature extractors into a speech anonymization pipeline, offering provable privacy bounds and improved protection against adversaries.
Findings
High utility retention for speech recognition tasks
Significantly improved speaker privacy protection
Provable upper bounds on speaker information in anonymized speech
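The provable bound mentioned above rests on differential privacy. The summary does not say which mechanism the authors use or where noise is injected, so the following is only a minimal sketch of one standard option, the Laplace mechanism, applied to a generic feature vector. The function names, the L1 clipping bound, and the choice of mechanism are all assumptions made for illustration and are not taken from the paper.

```python
import random


def clip_l1(vector, bound):
    """Scale `vector` down so that its L1 norm is at most `bound`."""
    norm = sum(abs(v) for v in vector)
    if norm <= bound:
        return list(vector)
    return [v * bound / norm for v in vector]


def sample_laplace(scale, rng):
    """Draw one Laplace(0, scale) sample as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def laplace_privatize(features, epsilon, bound=1.0, rng=None):
    """Hypothetical helper: clip a feature vector, then add Laplace noise.

    After clipping, any two inputs differ by at most 2 * bound in L1 norm,
    so noise with scale 2 * bound / epsilon gives epsilon-differential
    privacy for the released vector (standard Laplace mechanism).
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if bound <= 0:
        raise ValueError("bound must be positive")
    rng = rng if rng is not None else random.Random()
    clipped = clip_l1(features, bound)
    scale = 2.0 * bound / epsilon
    return [v + sample_laplace(scale, rng) for v in clipped]


if __name__ == "__main__":
    # Toy 4-dimensional "feature" vector; real speech features would be much larger.
    toy_features = [0.4, -0.3, 0.2, 0.1]
    print(laplace_privatize(toy_features, epsilon=1.0, rng=random.Random(0)))
```

A smaller `epsilon` means a larger noise scale, which gives stronger privacy but distorts the features more. This is the same privacy and utility trade-off that the findings describe.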
Abstract
Sharing real-world speech utterances is key to the training and deployment of voice-based services. However, it also raises privacy risks as speech contains a wealth of personal data. Speaker anonymization aims to remove speaker information from a speech utterance while leaving its linguistic and prosodic attributes intact. State-of-the-art techniques operate by disentangling the speaker information (represented via a speaker embedding) from these attributes and re-synthesizing speech based on the speaker embedding of another speaker. Prior research in the privacy community has shown that anonymization often provides brittle privacy protection, even less so any provable guarantee. In this work, we show that disentanglement is indeed not perfect: linguistic and prosodic attributes still contain speaker information. We remove speaker information from these attributes by introducing…
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Hate Speech and Cyberbullying Detection · Speech Recognition and Synthesis
