Privacy-Preserving Speech Representation Learning using Vector Quantization
Pierre Champion (MULTISPEECH), Denis Jouvet (MULTISPEECH), Anthony Larcher (LIUM)

TL;DR
This paper introduces a vector quantization method to produce privacy-preserving speech representations that hide speaker identity while maintaining speech recognition accuracy.
Contribution
It proposes a novel approach using vector quantization to balance speech utility and speaker privacy in deep speech recognition models.
Findings
Quantization reduces speaker information in representations.
The trade-off between recognition accuracy and privacy can be controlled through the size of the quantization dictionary.
Method effectively conceals speaker identity without degrading recognition performance.
Abstract
With the popularity of virtual assistants (e.g., Siri, Alexa), the use of speech recognition is becoming more and more widespread. However, speech signals contain a lot of sensitive information, such as the speaker's identity, which raises privacy concerns. The presented experiments show that the representations extracted by the deep layers of speech recognition networks contain speaker information. This paper aims to produce an anonymous representation while preserving speech recognition performance. To this end, we propose to use vector quantization to constrain the representation space and induce the network to suppress the speaker identity. The choice of the quantization dictionary size makes it possible to configure the trade-off between utility (speech recognition) and privacy (speaker identity concealment).
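The quantization step described in the abstract can be illustrated with a minimal sketch. It assumes Euclidean nearest-neighbour lookup against a fixed codebook; the function name `quantize`, the feature dimension, the frame count, and the random codebooks are all hypothetical and chosen for illustration only. The paper's actual network, training procedure, and codebook learning are not described in this summary and are not reproduced here.

```python
import numpy as np


def quantize(frames: np.ndarray, codebook: np.ndarray):
    """Replace each frame vector with its nearest codebook entry (Euclidean).

    frames:   (T, D) array of frame-level representations
    codebook: (K, D) array, the quantization dictionary of size K
    Returns the quantized frames (T, D) and the chosen code indices (T,).
    """
    # Squared distance between every frame and every code: shape (T, K)
    dists = ((frames[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    indices = dists.argmin(axis=1)
    return codebook[indices], indices


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical input: 50 frames of 16-dimensional features
    frames = rng.normal(size=(50, 16))
    for k in (4, 64):
        # Hypothetical random codebook; a real system would learn it
        codebook = rng.normal(size=(k, 16))
        quantized, idx = quantize(frames, codebook)
        print(k, quantized.shape, len(np.unique(idx)))
```

After quantization, every frame can take at most K distinct values. A smaller dictionary therefore leaves less room for speaker-specific detail in the representation, and a larger one keeps more detail for recognition, which is the utility and privacy trade-off the abstract refers to.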
Taxonomy
Topics: Speech Recognition and Synthesis · Geophysical Methods and Applications · Speech and Audio Processing
