SegReConcat: A Data Augmentation Method for Voice Anonymization Attack
Ridwan Arefeen, Xiaoxiao Miao, Rong Tong, Aik Beng Ng, Simon See

TL;DR
SegReConcat is a data augmentation technique that strengthens attacks on voice anonymization: it rearranges word-level segments of anonymized speech so that an attacker's speaker verification model can pick up residual speaker cues, increasing de-anonymization success.
Contribution
Introduces a word-level segmentation, rearrangement, and concatenation method that improves attacker-side de-anonymization of anonymized speech.
Findings
Improves de-anonymization on five of the seven anonymization systems evaluated in the VoicePrivacy Attacker Challenge 2024 framework.
Effectively disrupts long-term contextual cues in anonymized speech.
Demonstrates the vulnerability of current anonymization methods.
Abstract
Voice anonymization seeks to conceal the identity of the speaker while maintaining the utility of speech data. However, residual speaker cues often persist, posing privacy risks. We propose SegReConcat, a data augmentation method for attacker-side enhancement of automatic speaker verification systems. SegReConcat segments anonymized speech at the word level, rearranges the segments using random or similarity-based strategies to disrupt long-term contextual cues, and concatenates them with the original utterance, allowing an attacker to learn source speaker traits from multiple perspectives. The proposed method is evaluated in the VoicePrivacy Attacker Challenge 2024 framework across seven anonymization systems, where it improves de-anonymization on five of the seven.
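The segment, rearrange, and concatenate steps described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `seg_re_concat`, the `(start, end)` sample-index format for word boundaries (for example, from a forced aligner), and the choice to place the original utterance before the rearranged copy are all assumptions. Only the random strategy is shown; the similarity-based strategy is not specified in this summary.

```python
import numpy as np


def seg_re_concat(waveform, word_boundaries, seed=None):
    """Append a randomly reordered copy of the word segments to the utterance.

    waveform:        1-D array of audio samples (anonymized speech).
    word_boundaries: list of (start, end) sample indices, one per word.
                     Hypothetical input format; assumed to come from an
                     external word-level alignment step.
    seed:            optional seed for a reproducible permutation.
    """
    waveform = np.asarray(waveform)
    # Cut the utterance into word-level segments.
    segments = [waveform[start:end] for start, end in word_boundaries]
    if not segments:
        # Nothing to rearrange: return the utterance unchanged.
        return waveform.copy()

    # Random strategy: shuffle the segment order to break long-term context.
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(segments))
    rearranged = np.concatenate([segments[i] for i in order])

    # Assumption: original utterance first, rearranged copy second.
    return np.concatenate([waveform, rearranged])


# Toy usage with a 10-sample "utterance" split into three "words".
utterance = np.arange(10, dtype=np.float32)
boundaries = [(0, 3), (3, 7), (7, 10)]
augmented = seg_re_concat(utterance, boundaries, seed=0)
```

In this sketch the augmented utterance is twice as long as the input when the boundaries cover every sample: the first half is the untouched original and the second half contains the same words in shuffled order.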
