Voice Privacy with Smart Digital Assistants in Educational Settings
Mohammad Niknazar, Aditya Vempaty, Ravi Kokku

TL;DR
This paper presents a practical framework for protecting user voice privacy in educational settings by disguising speaker identity at the source while maintaining speech intelligibility for automatic transcription.
Contribution
It introduces a novel combination of speaker identification and speech conversion techniques to enhance voice privacy directly on devices in educational environments.
Findings
The framework effectively disguises speaker identity.
Speech content remains intelligible for ASR systems.
The approach is suitable for privacy-sensitive educational contexts.
Abstract
The emergence of voice-assistant devices ushers in delightful user experiences not just in the smart home, but also in diverse educational environments, from classrooms to personalized learning and tutoring. However, using voice as an interaction modality can also expose the user's identity, which hinders the broader adoption of voice interfaces; this is especially important in environments where children are present and their voice privacy needs to be protected. To this end, building on state-of-the-art techniques proposed in the literature, we design and evaluate a practical and efficient framework for voice privacy at the source. The approach combines speaker identification (SID) and speech conversion methods to randomly disguise the identity of users right on the device that records the speech, while ensuring that the transformed utterances of users can still be…
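
The abstract does not spell out how SID and speech conversion are combined, so the following is only a minimal sketch of one possible reading: use a speaker-similarity score to rule out target voices that sound too close to the real speaker, then pick one of the remaining targets at random. The embeddings, the `threshold` value, and the names `cosine` and `pick_disguise` are hypothetical and do not come from the paper; the conversion step itself is not shown.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity between two equal-length speaker embeddings."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def pick_disguise(source_embedding, target_embeddings, threshold=0.5, rng=None):
    """Randomly choose a target voice that is dissimilar to the source speaker.

    source_embedding  -- embedding of the real speaker (assumed to come from
                         some SID model; hypothetical here)
    target_embeddings -- dict mapping target-voice name to its embedding
    threshold         -- similarity at or above which a target is considered
                         too close to the source (illustrative value)
    """
    rng = rng or random.Random()
    # Keep only targets the similarity check would not confuse with the source.
    candidates = [
        name
        for name, emb in target_embeddings.items()
        if cosine(source_embedding, emb) < threshold
    ]
    if not candidates:
        raise ValueError("no target voice is far enough from the source speaker")
    # Random choice, so the same user is not always mapped to the same voice.
    return rng.choice(candidates)
```

In such a design, the chosen name would then be passed to whatever speech conversion model runs on the device, and the converted audio, not the original, would be sent on for transcription.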
