NPU-NTU System for Voice Privacy 2024 Challenge
Jixun Yao, Nikita Kuzmin, Qing Wang, Pengcheng Guo, Ziqian Ning, Dake Guo, Kong Aik Lee, Eng-Siong Chng, Lei Xie

TL;DR
This paper presents a novel speaker anonymization system for the VoicePrivacy Challenge 2024, employing disentangled neural codecs and multiple distillation strategies to effectively protect speaker identity while preserving speech content and emotion.
Contribution
It introduces a new disentangled neural codec architecture with serial and semantic distillation methods for improved speaker anonymization.
Findings
Achieves the best privacy and emotion preservation trade-off in VPC 2024.
Utilizes multiple distillation techniques for disentangling speaker, content, and emotion.
Employs a weighted sum of candidate speaker identities for anonymization.
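The weighted-sum idea in the last finding can be illustrated with a minimal sketch. This summary does not say how the candidate identities are selected or how the weights are chosen, so the function name, the toy 2-dimensional embeddings, and the normalization of weights to sum to 1 are all assumptions made for illustration, not the authors' implementation.

```python
def weighted_speaker_identity(candidates, weights):
    """Combine candidate speaker embeddings into one pseudo-speaker embedding.

    Hypothetical helper: `candidates` is a list of equal-length embedding
    vectors and `weights` holds one non-negative weight per candidate.
    Rescaling the weights to sum to 1 is an assumption of this sketch.
    """
    if not candidates or len(candidates) != len(weights):
        raise ValueError("need one weight per candidate")
    if any(w < 0 for w in weights) or sum(weights) == 0:
        raise ValueError("weights must be non-negative and not all zero")
    dim = len(candidates[0])
    if any(len(c) != dim for c in candidates):
        raise ValueError("all candidates must have the same dimension")

    total = sum(weights)
    # Weighted sum, computed one embedding dimension at a time.
    return [
        sum((w / total) * c[i] for w, c in zip(weights, candidates))
        for i in range(dim)
    ]


# Toy example: three made-up 2-dimensional "speaker embeddings".
candidates = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
pseudo_speaker = weighted_speaker_identity(candidates, [1.0, 1.0, 2.0])
print(pseudo_speaker)
```

Because the result mixes several candidates, it need not match any single real speaker in the pool, which is the property an anonymized identity is meant to have.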
Abstract
Speaker anonymization is an effective privacy protection solution that conceals the speaker's identity while preserving the linguistic content and paralinguistic information of the original speech. To establish a fair benchmark and facilitate comparison of speaker anonymization systems, the VoicePrivacy Challenge (VPC) was held in 2020 and 2022, with a new edition planned for 2024. In this paper, we describe our proposed speaker anonymization system for VPC 2024. Our system employs a disentangled neural codec architecture and a serial disentanglement strategy to gradually disentangle the global speaker identity and time-variant linguistic content and paralinguistic information. We introduce multiple distillation methods to disentangle linguistic content, speaker identity, and emotion. These methods include semantic distillation, supervised speaker distillation, and frame-level emotion…
Taxonomy
Topics: Robotics and Automated Systems
Methods: Sparse Evolutionary Training
