VoiceMask: Anonymize and Sanitize Voice Input on Mobile Devices

Jianwei Qian; Haohua Du; Jiahui Hou; Linlin Chen; Taeho Jung; Xiang-Yang Li; Yu Wang; Yanbo Deng

arXiv:1711.11460 · cs.CR · December 1, 2017 · 25 cites

TL;DR

VoiceMask is a privacy-preserving system for mobile voice input that sanitizes voice data on-device before cloud-based speech recognition, significantly reducing user identification risks while maintaining high recognition accuracy.

Contribution

We introduce VoiceMask, a novel on-device voice sanitization framework combining voice conversion and keyword substitution to protect user privacy in cloud speech recognition.

Findings

01

Reduces voice identification risk by 84%

02

Keeps the drop in speech recognition accuracy within 14.2%

03

Efficient implementation on Android devices

Abstract

Voice input has greatly improved the user experience of mobile devices by freeing our hands from typing on a small screen. Speech recognition is the key technology that powers voice input, and it is usually outsourced to the cloud for the best performance. However, the cloud might compromise users' privacy by identifying them by voice, learning their sensitive input content via speech recognition, and then profiling them based on that content. In this paper, we design an intermediary between users and the cloud, named VoiceMask, that sanitizes users' voice data before sending it to the cloud for speech recognition. We analyze the potential privacy risks and aim to protect users' identities and sensitive input content from being disclosed to the cloud. VoiceMask adopts a carefully designed voice conversion mechanism that is resistant to several attacks.…

Taxonomy

Topics: Speech Recognition and Synthesis · Music and Audio Processing · Speech and Audio Processing