Speech Privacy Leakage from Shared Gradients in Distributed Learning
Zhuohang Li, Jiaxin Zhang, Jian Liu

TL;DR
This paper investigates how gradients shared during distributed training of speech models can leak private information, showing that speech content and speaker identity can be recovered without direct access to the data.
Contribution
The paper analyzes speech privacy leakage from shared gradients in distributed learning, an area the authors describe as little explored compared with the image domain, and shows that inferring speech content and speaker identity is feasible.
Findings
Shared gradients can reveal speech content and speaker identity.
Experiments show high similarity between original and recovered speech signals.
Privacy leakage occurs even without direct data access.
Abstract
Distributed machine learning paradigms, such as federated learning, have been recently adopted in many privacy-critical applications for speech analysis. However, such frameworks are vulnerable to privacy leakage attacks from shared gradients. Despite extensive efforts in the image domain, the exploration of speech privacy leakage from gradients is quite limited. In this paper, we explore methods for recovering private speech/speaker information from the shared gradients in distributed learning settings. We conduct experiments on a keyword spotting model with two different types of speech features to quantify the amount of leaked information by measuring the similarity between the original and recovered speech signals. We further demonstrate the feasibility of inferring various levels of side-channel information, including speech content and speaker identity, under the distributed…
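To make the idea of gradient leakage concrete, here is a minimal sketch on a toy model. It is not the paper's keyword spotting model or its attack. It assumes a single linear layer with a bias, trained with softmax cross-entropy on one example, where the shared gradients reveal both the label and the input analytically. All names and sizes (`client_gradients`, `invert_gradients`, 10 classes, 40 features) are hypothetical, and the random vector only stands in for a speech feature vector.

```python
import math
import random


def softmax(z):
    m = max(z)  # subtract the max for numerical stability
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def client_gradients(W, b, x, label):
    """Cross-entropy gradients for one example on a linear softmax classifier."""
    logits = [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]
    p = softmax(logits)
    # dL/dlogits = p - onehot(label); this is also dL/db
    delta = [pi - (1.0 if i == label else 0.0) for i, pi in enumerate(p)]
    grad_W = [[d * xi for xi in x] for d in delta]  # dL/dW = outer(delta, x)
    return grad_W, delta


def invert_gradients(grad_W, grad_b):
    """Toy analytic attack: recover the label and the input from the gradients."""
    # Only the true class has a negative bias gradient (p - 1 < 0).
    label = min(range(len(grad_b)), key=lambda i: grad_b[i])
    # Each row of dL/dW is the input scaled by the matching entry of dL/db.
    x_rec = [g / grad_b[label] for g in grad_W[label]]
    return x_rec, label


def cosine_similarity(u, v):
    dot = sum(a * c for a, c in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(c * c for c in v)))


random.seed(0)
n_classes, n_features = 10, 40  # hypothetical sizes
W = [[random.gauss(0.0, 0.1) for _ in range(n_features)] for _ in range(n_classes)]
b = [0.0] * n_classes
x_private = [random.gauss(0.0, 1.0) for _ in range(n_features)]  # stand-in features
label_private = 3  # stand-in keyword class

grad_W, grad_b = client_gradients(W, b, x_private, label_private)
x_rec, label_rec = invert_gradients(grad_W, grad_b)
cosine = cosine_similarity(x_rec, x_private)
print(label_rec == label_private, cosine)
```

In this toy setting the recovery is exact up to floating-point error, and cosine similarity is used as one simple way to compare the original and recovered vectors. Deeper models generally have no such closed form, and attacks in the literature typically optimize a dummy input until its gradients match the shared ones.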
Taxonomy
Topics: Geophysical Methods and Applications · Speech Recognition and Synthesis
