Retrieving Speaker Information from Personalized Acoustic Models for Speech Recognition
Salima Mdhaffar, Jean-François Bonastre, Marc Tommasi, Natalia Tomashenko, Yannick Estève

TL;DR
This paper demonstrates that personalized acoustic models in speech recognition can leak personal information, such as gender and identity, through weight matrix analysis, raising privacy concerns.
Contribution
It introduces a method to infer speaker gender and identity solely from neural network weights of personalized models, highlighting privacy risks.
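As a rough illustration of what "exploiting weight matrix changes" could look like in practice, the sketch below computes per-layer differences between a generic model and a speaker-adapted copy, then compares two such differences with cosine similarity. This is a minimal sketch under assumptions, not the authors' pipeline: the function names, the dictionary-of-arrays weight format, and the choice of cosine similarity are all hypothetical.

```python
import numpy as np

def layer_delta_features(base_weights, adapted_weights):
    """Return one flattened delta vector (adapted - base) per layer.

    Both arguments are assumed to be dicts mapping a layer name to a
    NumPy array of identical shape (hypothetical format).
    """
    return {
        name: (adapted_weights[name] - base_weights[name]).ravel()
        for name in base_weights
    }

def cosine_similarity(u, v):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom > 0 else 0.0

# Synthetic example: one generic model and two adapted copies.
rng = np.random.default_rng(0)
base = {"layer1": rng.normal(size=(4, 4)), "layer2": rng.normal(size=(4, 2))}
adapted_a = {k: w + 0.01 * rng.normal(size=w.shape) for k, w in base.items()}
adapted_b = {k: w + 0.01 * rng.normal(size=w.shape) for k, w in base.items()}

feats_a = layer_delta_features(base, adapted_a)
feats_b = layer_delta_features(base, adapted_b)
score = cosine_similarity(feats_a["layer1"], feats_b["layer1"])
```

A per-layer score like this could then be fed to a classifier or thresholded for verification; which layers are informative is an empirical question that the paper's findings address.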
Findings
Gender can be identified with 95% accuracy using early layers.
Speaker verification achieves an EER of 9.07% using weight analysis.
Personalized models can leak sensitive speaker information without access to raw data.
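For readers unfamiliar with the equal error rate (EER) quoted above, it is the operating point at which the false acceptance rate equals the false rejection rate. The sketch below shows one simple way to estimate it from verification scores by sweeping thresholds. It assumes higher scores mean "same speaker"; the function name and the threshold-sweep approach are illustrative choices, not the evaluation code used in the paper.

```python
import numpy as np

def equal_error_rate(target_scores, nontarget_scores):
    """Estimate the EER by sweeping a threshold over all observed scores.

    target_scores: scores for same-speaker trials.
    nontarget_scores: scores for different-speaker trials.
    A trial is accepted when its score is >= the threshold.
    """
    target = np.asarray(target_scores, dtype=float)
    nontarget = np.asarray(nontarget_scores, dtype=float)
    thresholds = np.unique(np.concatenate([target, nontarget]))
    # False rejection rate: fraction of target trials scored below the threshold.
    frr = np.array([(target < t).mean() for t in thresholds])
    # False acceptance rate: fraction of non-target trials at or above it.
    far = np.array([(nontarget >= t).mean() for t in thresholds])
    # Take the threshold where the two rates are closest and average them.
    i = int(np.argmin(np.abs(far - frr)))
    return float((far[i] + frr[i]) / 2)
```

With perfectly separated score sets the estimate is 0; with identical score distributions it approaches 0.5, which corresponds to chance-level verification.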
Abstract
The widespread availability of powerful personal devices capable of collecting their users' voices has opened the opportunity to build speaker-adapted automatic speech recognition (ASR) systems or to participate in collaborative learning of ASR. In both cases, personalized acoustic models (AM), i.e. AM fine-tuned with speaker-specific data, can be built. A question that naturally arises is whether the dissemination of personalized acoustic models can leak personal information. In this paper, we show that it is possible to retrieve the gender of the speaker, and also their identity, by just exploiting the weight matrix changes of a neural acoustic model locally adapted to this speaker. Incidentally, we observe phenomena that may be useful towards explainability of deep neural networks in the context of speech processing. Gender can be identified almost surely using only the first layers and speaker verification…
Taxonomy
Methods: Attention Model
