The Database and Benchmark for the Source Speaker Tracing Challenge 2024
Ze Li, Yuke Lin, Tian Yao, Hongbin Suo, Pengyuan Zhang, Yanzhen Ren, Zexin Cai, Hiromitsu Nishizaki, Ming Li

TL;DR
The paper introduces the Source Speaker Tracing Challenge 2024, which provides a large-scale database, benchmarks, and baseline systems for source speaker verification, plus a new conversion method recognition task, to advance research on speaker verification under voice conversion attacks.
Contribution
It presents a new large-scale database, benchmarks, and baseline systems for source speaker verification, addressing data limitations and methodological constraints in the field.
Findings
Generated a large-scale converted speech database with 16 VC methods.
Developed baseline systems based on MFA-Conformer architecture.
Introduced a conversion method recognition task.
Abstract
Voice conversion (VC) systems can transform audio to mimic another speaker's voice, thereby attacking speaker verification (SV) systems. However, ongoing studies on source speaker verification (SSV) are hindered by limited data availability and methodological constraints. This paper presents the Source Speaker Tracing Challenge (SSTC) on SLT 2024, which aims to fill the gap in the database and benchmark for the SSV task. In this study, we generate a large-scale converted speech database with 16 common VC methods and train a batch of baseline systems based on the MFA-Conformer architecture. In addition, we introduce a related task called conversion method recognition, with the aim of assisting the SSV task. We expect SSTC to be a platform for advancing the development of the SSV task and provide further insights into the performance and limitations of current SV systems against VC…
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and Audio Processing
