Any-to-any Speaker Attribute Perturbation for Asynchronous Voice Anonymization
Liping Chen, Chenyang Guo, Rui Wang, Kong Aik Lee, Zhenhua Ling

TL;DR
This paper introduces an any-to-any training strategy for voice anonymization that anonymizes utterances to a pseudo-speaker rather than a designated real speaker, avoiding the privacy risk that targeted-attack training poses to that speaker.
Contribution
The paper proposes an any-to-any training method based on a batch mean loss, and a speaker-adversarial speech generation model built on it, aiming to improve the privacy and attack robustness of voice anonymization.
Findings
Effective anonymization on the VoxCeleb dataset
Reduced privacy risk compared with targeted-attack training strategies
Insights into model limitations and future directions
Abstract
Speaker attribute perturbation offers a feasible approach to asynchronous voice anonymization by employing adversarially perturbed speech as anonymized output. In order to enhance the identity unlinkability among anonymized utterances from the same original speaker, the targeted attack training strategy is usually applied to anonymize the utterances to a common designated speaker. However, this strategy may violate the privacy of the designated speaker who is an actual speaker. To mitigate this risk, this paper proposes an any-to-any training strategy. It is accomplished by defining a batch mean loss to anonymize the utterances from various speakers within a training mini-batch to a common pseudo-speaker, which is approximated as the average speaker in the mini-batch. Based on this, a speaker-adversarial speech generation model is proposed, incorporating the supervision from both the…
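The batch mean idea described in the abstract can be illustrated with a small, framework-free sketch. It is not the paper's implementation: the function names, the use of cosine distance, and the L2 normalization are all assumptions made for illustration, and the abstract does not give the exact form of the loss. The sketch takes one speaker embedding per anonymized utterance in a mini-batch, treats their average as the pseudo-speaker, and scores how far each embedding is from it.

```python
import math


def l2_normalize(vec, eps=1e-8):
    """Scale a vector to (approximately) unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / (norm + eps) for x in vec]


def batch_mean_loss(embeddings):
    """Hypothetical batch mean loss (an assumption, not the paper's formula).

    embeddings: list of speaker-embedding vectors, one per anonymized
    utterance in the mini-batch. Returns the average cosine distance
    between each embedding and the mini-batch average embedding, which
    stands in for the pseudo-speaker.
    """
    normed = [l2_normalize(e) for e in embeddings]
    dim = len(normed[0])
    # Pseudo-speaker: the average of the normalized embeddings in the batch.
    mean = [sum(e[d] for e in normed) / len(normed) for d in range(dim)]
    pseudo = l2_normalize(mean)
    # Cosine similarity of each utterance embedding to the pseudo-speaker.
    cos = [sum(a * b for a, b in zip(e, pseudo)) for e in normed]
    return sum(1.0 - c for c in cos) / len(cos)
```

The loss is close to zero when all embeddings in the batch point in the same direction and grows as they spread apart, so minimizing it pulls utterances from different speakers toward one common pseudo-speaker. In actual training this quantity would be computed with a differentiable tensor library so that gradients reach the perturbation generator.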
Taxonomy
Topics: Speech Recognition and Synthesis · Adversarial Robustness in Machine Learning · Privacy-Preserving Technologies in Data
