Voice-Indistinguishability: Protecting Voiceprint in Privacy-Preserving Speech Data Release
Yaowei Han, Sheng Li, Yang Cao, Qiang Ma, Masatoshi Yoshikawa

TL;DR
This paper introduces a new privacy metric called voice-indistinguishability for protecting speaker identity in speech data, along with mechanisms to ensure privacy while maintaining data utility, verified through experiments.
Contribution
It proposes a formal privacy definition for voiceprint protection and develops mechanisms for privacy-preserving speech data release based on this metric.
Findings
The proposed methods effectively protect voiceprint privacy.
Experiments show the mechanisms are efficient and maintain data utility.
Voice-indistinguishability extends differential privacy to speech data.
Abstract
With the development of smart devices, such as the Amazon Echo and Apple's HomePod, speech data have become a new dimension of big data. However, privacy and security concerns may hinder the collection and sharing of real-world speech data, which contain the speaker's identifiable information, i.e., voiceprint, which is considered a type of biometric identifier. Current studies on voiceprint privacy protection do not provide either a meaningful privacy-utility trade-off or a formal and rigorous definition of privacy. In this study, we design a novel and rigorous privacy metric for voiceprint privacy, which is referred to as voice-indistinguishability, by extending differential privacy. We also propose mechanisms and frameworks for privacy-preserving speech data release satisfying voice-indistinguishability. Experiments on public datasets verify the effectiveness and efficiency of the…
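The abstract does not state the formal definition, so the following is only a minimal sketch of one common way to extend differential privacy to a metric space: a mechanism is treated as private if, for any two inputs x and x', the probability of any output changes by at most a factor of exp(epsilon * d(x, x')). Everything specific here is an assumption made for illustration and is not taken from this summary: the angular distance between voiceprint embeddings, the finite candidate set, the exponential-style sampler, and all function names.

```python
import math
import random


def angular_distance(u, v):
    """Angular distance in [0, 1]: arccos(cosine similarity) / pi (assumed metric)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp to guard against floating-point drift outside [-1, 1].
    cos = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.acos(cos) / math.pi


def release_probabilities(x, candidates, epsilon):
    """Probability of releasing each candidate voiceprint in place of x.

    Weights decay as exp(-epsilon/2 * d(x, c)). The factor 1/2 absorbs the
    change in the normalising constant, so the ratio bound holds with epsilon.
    """
    weights = [math.exp(-0.5 * epsilon * angular_distance(x, c)) for c in candidates]
    total = sum(weights)
    return [w / total for w in weights]


def sample_voiceprint(x, candidates, epsilon, rng=random):
    """Draw one replacement voiceprint (hypothetical helper)."""
    probs = release_probabilities(x, candidates, epsilon)
    return rng.choices(candidates, weights=probs, k=1)[0]
```

In this sketch a smaller epsilon flattens the distribution, so the released voiceprint reveals less about the original, while a larger epsilon concentrates probability on candidates close to the original and preserves more utility.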
Taxonomy
Topics: Speech Recognition and Synthesis · Voice and Speech Disorders · Privacy-Preserving Technologies in Data
