Privacy-Preserving Speech Representation Learning using Vector Quantization

Pierre Champion (MULTISPEECH); Denis Jouvet (MULTISPEECH); Anthony Larcher (LIUM)

arXiv:2203.09518·eess.AS·March 21, 2022

Open Access

TL;DR

This paper introduces a vector quantization method to produce privacy-preserving speech representations that hide speaker identity while maintaining speech recognition accuracy.

Contribution

It proposes a novel approach using vector quantization to balance speech utility and speaker privacy in deep speech recognition models.

Findings

01

Quantization reduces speaker information in representations.

02

Trade-off between recognition accuracy and privacy can be controlled.

03

Method conceals speaker identity while preserving recognition performance, subject to the dictionary-size trade-off.

Abstract

With the popularity of virtual assistants (e.g., Siri, Alexa), the use of speech recognition is now becoming more and more widespread. However, speech signals contain a lot of sensitive information, such as the speaker's identity, which raises privacy concerns. The presented experiments show that the representations extracted by the deep layers of speech recognition networks contain speaker information. This paper aims to produce an anonymous representation while preserving speech recognition performance. To this end, we propose to use vector quantization to constrain the representation space and induce the network to suppress the speaker identity. The choice of the quantization dictionary size makes it possible to configure the trade-off between utility (speech recognition) and privacy (speaker identity concealment).
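
To make the quantization step in the abstract concrete, here is a minimal sketch of a nearest-neighbour codebook lookup in NumPy. It is not the authors' implementation: the function name `quantize`, the 16-dimensional features, the random codebook, and the dictionary sizes 4 and 64 are all assumptions chosen for illustration. In the paper the dictionary is learned inside the speech recognition network, which this sketch does not attempt to show.

```python
import numpy as np


def quantize(frames, codebook):
    """Replace each frame with its nearest codebook entry (Euclidean distance)."""
    # Pairwise squared distances, shape (num_frames, num_codes).
    dists = ((frames[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    indices = dists.argmin(axis=1)
    return codebook[indices], indices


rng = np.random.default_rng(0)
# Hypothetical stand-in for network activations: 200 frames, 16 dimensions.
frames = rng.normal(size=(200, 16))

for num_codes in (4, 64):
    # Random stand-in for a learned quantization dictionary (assumption).
    codebook = rng.normal(size=(num_codes, 16))
    quantized, indices = quantize(frames, codebook)
    error = float(((frames - quantized) ** 2).mean())
    print(f"dictionary size {num_codes}: "
          f"{len(np.unique(indices))} codes used, mean squared error {error:.3f}")
```

Every frame is mapped to one of at most `num_codes` vectors, so the dictionary size bounds how much detail, including speaker-specific detail, the representation can carry. That bound is the lever behind the utility/privacy trade-off the abstract describes.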

Taxonomy

Topics: Speech Recognition and Synthesis · Geophysical Methods and Applications · Speech and Audio Processing