VoiceCloak: A Multi-Dimensional Defense Framework against Unauthorized Diffusion-based Voice Cloning
Qianyue Hu, Junyan Wu, Wei Lu, Xiangyang Luo

TL;DR
VoiceCloak is a comprehensive defense framework that employs adversarial perturbations and disruption of diffusion model processes to prevent unauthorized voice cloning, addressing the unique challenges posed by diffusion-based generative mechanisms.
Contribution
VoiceCloak introduces a multi-dimensional proactive defense designed specifically for diffusion-based voice cloning, disrupting both speaker identity and output quality.
Findings
High defense success rate against diffusion-based voice cloning
Effective obfuscation of speaker identity and degradation of speech quality
Demonstrated robustness across various experimental scenarios
Abstract
Diffusion Models (DMs) have achieved remarkable success in realistic voice cloning (VC), but they also increase the risk of malicious misuse. Existing proactive defenses designed for traditional VC models aim to disrupt the forgery process, but they have proven incompatible with DMs because of the intricate generative mechanisms of diffusion. To bridge this gap, we introduce VoiceCloak, a multi-dimensional proactive defense framework that aims to obfuscate speaker identity and degrade perceptual quality in potential unauthorized VC. To achieve these goals, we conduct a focused analysis to identify specific vulnerabilities within DMs, allowing VoiceCloak to disrupt the cloning process by introducing adversarial perturbations into the reference audio. Specifically, to obfuscate speaker identity, VoiceCloak first distorts representation learning…
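To make the idea of perturbing the reference audio concrete, the following is a minimal sketch of a generic projected-gradient attack under an L-infinity budget. It is not VoiceCloak's actual objective, which the abstract does not spell out here. The linear map `W` is a hypothetical stand-in for a speaker encoder, and `cloak`, `eps`, `step`, and `iters` are illustrative names and values chosen for this example only.

```python
import math
import random


def embed(W, x):
    """Toy 'speaker encoder': a fixed linear map (assumption, not a real model)."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def cloak(W, x, eps=0.01, step=0.002, iters=20, seed=0):
    """Push embed(x + delta) away from embed(x) while keeping |delta_i| <= eps."""
    rng = random.Random(seed)
    # Random start: the gradient of the distance is zero at delta = 0.
    delta = [rng.uniform(-eps, eps) for _ in x]
    clean = embed(W, x)
    for _ in range(iters):
        perturbed = embed(W, [a + d for a, d in zip(x, delta)])
        diff = [p - c for p, c in zip(perturbed, clean)]
        # Gradient of ||embed(x + delta) - embed(x)||^2 w.r.t. delta: 2 * W^T diff.
        grad = [2.0 * sum(W[k][i] * diff[k] for k in range(len(W)))
                for i in range(len(x))]
        # Signed ascent step, then projection back onto the eps-ball.
        delta = [max(-eps, min(eps, d + step * ((g > 0) - (g < 0))))
                 for d, g in zip(delta, grad)]
    return [a + d for a, d in zip(x, delta)]


# Toy data: a short sine wave as the "reference audio" and a random encoder.
rng = random.Random(1)
x = [0.5 * math.sin(2 * math.pi * 5 * t / 64) for t in range(64)]
W = [[rng.gauss(0, 1) for _ in range(64)] for _ in range(8)]

x_adv = cloak(W, x)
shift = math.dist(embed(W, x_adv), embed(W, x))
peak = max(abs(a - b) for a, b in zip(x_adv, x))
print(f"embedding shift: {shift:.4f}, peak perturbation: {peak:.4f}")
```

The budget `eps` caps how far any sample may move, which is what keeps such a perturbation small relative to the signal. A real defense would replace the toy encoder with the components of the targeted diffusion model.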
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing
Methods: Softmax · Attention Is All You Need
