Anonymization, Not Elimination: Utility-Preserved Speech Anonymization
Yunchong Xiao, Yuxiang Zhao, Ziyang Ma, Shuai Wang, Kai Yu, Jiachun Liao, and Xie Chen

TL;DR
This paper introduces a two-stage speech anonymization framework that strengthens privacy while preserving the utility of the data for downstream speech processing tasks, and it pairs the framework with a more comprehensive evaluation protocol.
Contribution
A novel two-stage framework combining content editing and flow-based speaker anonymization, improving privacy and utility over existing methods.
Findings
Achieves stronger privacy protection than existing methods with minimal utility loss.
Anonymized speech remains effective as training data for ASR, TTS, and SER models trained from scratch.
Provides a comprehensive evaluation protocol for speech anonymization.
Abstract
The growing reliance on large-scale speech data has made privacy protection a critical concern. However, existing anonymization approaches often degrade data utility, for example by disrupting acoustic continuity or reducing vocal diversity, which compromises the value of speech data for downstream tasks such as Automatic Speech Recognition (ASR), Text-to-Speech (TTS), and Speech Emotion Recognition (SER). Current evaluation practices are also limited, as they mainly rely on direct testing of anonymized speech with pretrained models, providing only a partial view of utility. To address these issues, we propose a novel two-stage framework that protects both linguistic content and acoustic identity while maintaining usability. For content privacy, we employ a generative speech editing model to seamlessly replace personally identifiable information (PII), and for voice privacy, we…
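As a rough illustration of the "anonymization, not elimination" idea behind the content-privacy stage, the sketch below replaces PII spans with surrogate values instead of deleting them, so the surrounding content stays intact. It works on a text transcript only; the paper applies a generative speech editing model to the audio itself, which is not reproduced here. The regex patterns, surrogate values, and the names `find_pii_edits` and `apply_edits` are all hypothetical and chosen for illustration.

```python
import re
from dataclasses import dataclass

# Hypothetical PII patterns (assumption): a real system would use a trained
# PII detector rather than two regular expressions.
PII_PATTERNS = {
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}

# Hypothetical surrogate values (assumption): same type as the original PII,
# so the sentence keeps its structure after replacement.
SURROGATES = {
    "PHONE": "555-010-0000",
    "EMAIL": "user@example.com",
}


@dataclass
class Edit:
    start: int        # character offset where the PII span begins
    end: int          # character offset just past the PII span
    label: str        # PII category, e.g. "PHONE"
    replacement: str  # surrogate text to insert


def find_pii_edits(transcript: str) -> list[Edit]:
    """Locate PII spans and pair each one with a surrogate replacement."""
    edits = []
    for label, pattern in PII_PATTERNS.items():
        for match in pattern.finditer(transcript):
            edits.append(Edit(match.start(), match.end(), label, SURROGATES[label]))
    return sorted(edits, key=lambda e: e.start)


def apply_edits(transcript: str, edits: list[Edit]) -> str:
    """Replace each span with its surrogate, leaving all other text untouched."""
    pieces = []
    cursor = 0
    for edit in edits:
        if edit.start < cursor:
            continue  # skip a span that overlaps one already replaced
        pieces.append(transcript[cursor:edit.start])
        pieces.append(edit.replacement)
        cursor = edit.end
    pieces.append(transcript[cursor:])
    return "".join(pieces)


if __name__ == "__main__":
    text = "Call me at 415-555-2671 or write to alice@mail.org today."
    print(apply_edits(text, find_pii_edits(text)))
```

In a speech setting, the character offsets would correspond to time-aligned audio regions that an editing model regenerates; that alignment step and the separate voice-anonymization stage are outside the scope of this sketch.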
