A Method to Reveal Speaker Identity in Distributed ASR Training, and How to Counter It

Trung Dang; Om Thakkar; Swaroop Ramaswamy; Rajiv Mathews; Peter Chin; Françoise Beaufays

arXiv:2104.07815·cs.CL·April 19, 2021·1 cites

Open Access · 1 Repo

TL;DR

This paper introduces Hessian-Free Gradients Matching, a novel method to identify speakers from gradients in distributed ASR training, revealing privacy risks and evaluating countermeasures like Dropout and Differential Privacy.

Contribution

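We propose the first method for revealing the speaker of a training utterance from a gradient alone. It is built on Hessian-Free Gradients Matching, an input reconstruction technique that needs no second derivatives of the loss function, and we demonstrate its effectiveness and privacy implications for federated ASR training.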

Findings

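01. Achieved 34% top-1 (51% top-5) speaker identification accuracy on LibriSpeech using the DeepSpeech architecture.

02. A dropout rate of 0.2 reduces speaker identification accuracy to near zero.

03. The method shows that gradients shared in distributed ASR training can leak the speaker's identity.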
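As a rough illustration of why dropout can interfere with gradient-based reconstruction, the sketch below applies inverted dropout at rate 0.2 to the input of a toy linear model and computes the weight gradient. The model, the helper names (`dropout_mask`, `grad_w`), and the data are hypothetical and not taken from the paper; the point is only that the gradient depends on a random mask an observer of the gradient does not see.

```python
import random

def dropout_mask(n, rate, rng):
    """Inverted-dropout mask: 0 with probability `rate`, else 1 / (1 - rate)."""
    keep = 1.0 - rate
    return [0.0 if rng.random() < rate else 1.0 / keep for _ in range(n)]

def grad_w(w, x, y, mask):
    """Gradient of 0.5 * (w . (mask * x) - y)^2 with respect to the weights w."""
    xm = [m * xi for m, xi in zip(mask, x)]            # masked input
    r = sum(wi * xi for wi, xi in zip(w, xm)) - y      # residual
    return [r * xi for xi in xm]

# Toy data (assumed for illustration only).
rng = random.Random(0)
w = [0.5, -1.0, 2.0, 0.3, -0.7, 1.1, 0.9, -0.4]
x = [1.0, 2.0, -1.0, 0.5, 1.5, -2.0, 0.8, 1.2]
y = 1.0

mask = dropout_mask(len(x), 0.2, rng)
g_dropout = grad_w(w, x, y, mask)                # gradient a client would send
g_plain = grad_w(w, x, y, [1.0] * len(x))        # gradient without dropout
print(mask)
print(g_dropout)
print(g_plain)
```

Every coordinate whose mask entry is zero contributes a zero gradient entry, and the surviving entries are rescaled, so the same utterance can yield different gradients on different draws of the mask.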

Abstract

End-to-end Automatic Speech Recognition (ASR) models are commonly trained over spoken utterances using optimization methods like Stochastic Gradient Descent (SGD). In distributed settings like Federated Learning, model training requires transmission of gradients over a network. In this work, we design the first method for revealing the identity of the speaker of a training utterance with access only to a gradient. We propose Hessian-Free Gradients Matching, an input reconstruction technique that operates without second derivatives of the loss function (required in prior works), which can be expensive to compute. We show the effectiveness of our method using the DeepSpeech model architecture, demonstrating that it is possible to reveal the speaker's identity with 34% top-1 accuracy (51% top-5 accuracy) on the LibriSpeech dataset. Further, we study the effect of two well-known techniques,…
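The general idea of gradient matching without second derivatives can be sketched on a toy problem. The code below is not the paper's Hessian-Free Gradients Matching algorithm: it is a minimal stand-in that assumes a linear regression model with a known label, and it estimates the descent direction for the matching loss by central finite differences, so only first-order gradient evaluations of the model are needed. All names (`grad_w`, `matching_loss`, `fd_gradient`, `reconstruct`) and the data are hypothetical.

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def grad_w(w, x, y):
    """Gradient of 0.5 * (w . x - y)^2 with respect to the weights w."""
    r = dot(w, x) - y
    return [r * xi for xi in x]

def matching_loss(w, x, y, target):
    """Squared L2 distance between the gradient at x and the observed gradient."""
    return sum((gi - ti) ** 2 for gi, ti in zip(grad_w(w, x, y), target))

def fd_gradient(w, x, y, target, eps=1e-5):
    """Central-difference estimate of d(matching_loss)/dx.

    Only calls grad_w, so no second derivatives of the model loss are computed.
    """
    est = []
    for i in range(len(x)):
        xp = list(x); xp[i] += eps
        xm = list(x); xm[i] -= eps
        diff = matching_loss(w, xp, y, target) - matching_loss(w, xm, y, target)
        est.append(diff / (2 * eps))
    return est

def reconstruct(w, y, target, x0, steps=500, lr=0.1):
    """Descend on the matching loss, halving the step until the loss decreases."""
    x = list(x0)
    loss = matching_loss(w, x, y, target)
    for _ in range(steps):
        d = fd_gradient(w, x, y, target)
        step = lr
        while step > 1e-12:
            cand = [xi - step * di for xi, di in zip(x, d)]
            cand_loss = matching_loss(w, cand, y, target)
            if cand_loss < loss:
                x, loss = cand, cand_loss
                break
            step /= 2
    return x, loss

# Toy setup (assumed): the attacker knows w and y and observes one gradient.
w = [0.5, -1.0, 2.0]
x_true = [1.0, 2.0, -1.0]
y = 1.0
observed = grad_w(w, x_true, y)

x0 = [0.0, 0.0, 0.0]
initial = matching_loss(w, x0, y, observed)
x_hat, final = reconstruct(w, y, observed, x0)
print(initial, final, x_hat)
```

The matching loss is non-convex even in this toy case, so the recovered input is not guaranteed to equal `x_true`; the sketch only shows that the loss can be driven down using gradient evaluations alone.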

Code & Models

Repositories

googleinterns/deepspeech-reconstruction (tf, Official)

Taxonomy

Topics: Speech Recognition and Synthesis · Geophysical Methods and Applications · Topic Modeling

Methods: Stochastic Gradient Descent · Dropout