Privacy-preserving Voice Analysis via Disentangled Representations
Ranya Aloufi, Hamed Haddadi, David Boyle

TL;DR
This paper introduces a privacy-aware speech data sharing framework using disentangled representations, effectively preventing attribute inference attacks while maintaining high accuracy in primary speech tasks.
Contribution
We propose a novel, user-configurable framework that removes sensitive attributes from speech data via disentangled representation learning, enhancing privacy without sacrificing task performance.
Findings
Reduces attribute inference attack success rates to near random guessing.
Maintains over 99% accuracy in speech recognition and user identification.
Effective across five datasets, demonstrating robustness and generality.
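To make "near random guessing" in the first finding concrete, here is a minimal sketch of how an attack's success rate could be compared against a chance baseline. It is not the authors' evaluation code. The function names, the toy emotion labels, and the use of a majority-class baseline as a stand-in for chance are all assumptions made for illustration.

```python
from collections import Counter


def chance_accuracy(labels):
    """Accuracy of always guessing the most common label (assumed chance baseline)."""
    counts = Counter(labels)
    return max(counts.values()) / len(labels)


def attack_advantage(true_labels, predicted_labels):
    """Attack accuracy minus the chance baseline; a value near 0 means near chance."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    accuracy = correct / len(true_labels)
    return accuracy - chance_accuracy(true_labels)


# Hypothetical toy data: four balanced emotion classes, 100 utterances.
true_emotion = ["happy", "sad", "angry", "neutral"] * 25

# Hypothetical attacker on raw speech: every attribute guessed correctly.
guesses_on_raw = list(true_emotion)

# Hypothetical attacker on filtered speech: reduced to one constant guess.
guesses_on_filtered = ["happy"] * 100

print(chance_accuracy(true_emotion))
print(attack_advantage(true_emotion, guesses_on_raw))
print(attack_advantage(true_emotion, guesses_on_filtered))
```

With balanced classes the majority-class baseline equals uniform random guessing (one in four here). An advantage close to zero after filtering is what a finding of this kind would correspond to under these assumptions.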
Abstract
Voice User Interfaces (VUIs) are increasingly popular and built into smartphones, home assistants, and Internet of Things (IoT) devices. Despite offering an always-on convenient user experience, VUIs raise new security and privacy concerns for their users. In this paper, we focus on attribute inference attacks in the speech domain, demonstrating the potential for an attacker to accurately infer a target user's sensitive and private attributes (e.g. their emotion, sex, or health status) from deep acoustic models. To defend against this class of attacks, we design, implement, and evaluate a user-configurable, privacy-aware framework for optimizing speech-related data sharing mechanisms. Our objective is to enable primary tasks such as speech recognition and user identification, while removing sensitive attributes in the raw speech data before sharing it with a cloud service provider. We…
