Configurable Privacy-Preserving Automatic Speech Recognition
Ranya Aloufi, Hamed Haddadi, David Boyle

TL;DR
This paper explores modular, configurable privacy-preserving automatic speech recognition by combining separation, recognition, and discretization modules, demonstrating how privacy can be enhanced while maintaining recognition performance.
Contribution
It introduces a modular ASR framework that allows configurable privacy levels, integrating state-of-the-art techniques to mitigate privacy risks at each stage.
Findings
Speech separation reduces privacy risks from overlapping speech.
Discretization minimizes paralinguistic information leakage.
Privacy levels can be tuned without significantly affecting recognition accuracy.
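The modular design summarized above can be illustrated with a minimal sketch. It assumes each stage (separation, recognition, discretization) is an independent callable and that privacy is configured by switching stages on or off. The names PipelineConfig and build_pipeline, and the stage signatures, are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Callable, List

Signal = List[float]    # stand-in for an audio waveform
Features = List[float]  # stand-in for acoustic-model representations


@dataclass
class PipelineConfig:
    # Hypothetical switches; the paper's actual configuration options may differ.
    use_separation: bool = True
    use_discretization: bool = True


def build_pipeline(
    config: PipelineConfig,
    separate: Callable[[Signal], List[Signal]],
    recognize: Callable[[Signal], Features],
    discretize: Callable[[Features], Features],
) -> Callable[[Signal], List[Features]]:
    """Compose independently supplied stages into one callable."""

    def run(mixture: Signal) -> List[Features]:
        # Split overlapping speech into per-speaker streams if enabled.
        sources = separate(mixture) if config.use_separation else [mixture]
        outputs = []
        for source in sources:
            features = recognize(source)
            # Optionally map representations to discrete values.
            if config.use_discretization:
                features = discretize(features)
            outputs.append(features)
        return outputs

    return run
```

Because each stage is passed in as an argument, a stage can be replaced or disabled without touching the others, which is the property the modular framing relies on.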
Abstract
Voice assistive technologies have given rise to far-reaching privacy and security concerns. In this paper, we investigate whether modular automatic speech recognition (ASR) can improve privacy in voice assistive systems by combining independently trained separation, recognition, and discretization modules to design configurable privacy-preserving ASR systems. We evaluate privacy concerns and the effects of applying various state-of-the-art techniques at each stage of the system, and report results using task-specific metrics (i.e., WER, ABX, and accuracy). We show that overlapping speech inputs to ASR systems present further privacy concerns, and how these may be mitigated using speech separation and optimization techniques. Our discretization module is shown to minimize paralinguistic privacy leakage from ASR acoustic models to levels commensurate with random guessing. We show that…
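Of the metrics named in the abstract, word error rate (WER) has a standard definition: the word-level edit distance between a reference transcript and the recognizer's hypothesis, divided by the number of reference words. The sketch below is a generic implementation of that definition, assuming whitespace tokenization and no text normalization; it is not the paper's evaluation code, and the function name word_error_rate is chosen here for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    print(word_error_rate("turn on the light", "turn off the light"))
```

Note that WER can exceed 1.0 when the hypothesis contains many inserted words, since the denominator counts only reference words.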
