SynVox2: Towards a privacy-friendly VoxCeleb2 dataset
Xiaoxiao Miao, Xin Wang, Erica Cooper, Junichi Yamagishi, Nicholas Evans, Massimiliano Todisco, Jean-François Bonastre, Mickael Rouvier

TL;DR
This paper introduces SynVox2, a synthetic version of the VoxCeleb2 dataset, designed to address privacy and ethical concerns while maintaining utility for speaker recognition tasks.
Contribution
The paper presents a method to generate a privacy-preserving synthetic VoxCeleb2 dataset, enabling ethical and legal use in speaker recognition research.
Findings
Synthetic data maintains comparable performance in speaker verification
Addresses privacy and legal issues associated with real datasets
Highlights challenges in using synthetic data for downstream tasks
Abstract
The success of deep learning in speaker recognition relies heavily on the use of large datasets. However, the data-hungry nature of deep learning methods has already been questioned on account of the ethical, privacy, and legal concerns that arise when using large-scale datasets of natural speech collected from real human speakers. For example, the widely used VoxCeleb2 dataset for speaker recognition is no longer accessible from the official website. To mitigate these concerns, this work presents an initiative to generate a privacy-friendly synthetic VoxCeleb2 dataset that ensures the quality of the generated speech in terms of privacy, utility, and fairness. We also discuss the challenges of using synthetic data for the downstream task of speaker verification.
Taxonomy
Topics: Speech Recognition and Synthesis
