TL;DR
This paper introduces LRPD, a large replay attack dataset for voice anti-spoofing with more than 1M utterances, and shows that a baseline trained on it reaches 0.28% EER on the LRPD evaluation subset and 11.91% EER on the ASVspoof 2017 eval set.
Contribution
The paper presents a new large-scale replay attack dataset for voice anti-spoofing, collected with 19 recording devices in 17 environments, along with a baseline system and an example PyTorch training pipeline.
Findings
The baseline system achieves 0.28% EER on the LRPD evaluation subset.
The same model reaches 11.91% EER on the ASVspoof 2017 eval set, which the authors take as evidence of consistent performance under fully unknown conditions.
The LRPD dataset is free for research.
Abstract
The latest research in the field of voice anti-spoofing (VAS) shows that deep neural networks (DNN) outperform classic approaches such as GMM in the task of presentation attack detection. However, DNNs require a lot of data to converge and still lack generalization ability. To foster the progress of neural network systems, we introduce the Large Replay Parallel Dataset (LRPD), aimed at the detection of replay attacks. LRPD contains more than 1M utterances collected by 19 recording devices in 17 different environments. We also provide an example training pipeline in PyTorch [1] and a baseline system that achieves 0.28% Equal Error Rate (EER) on the evaluation subset of LRPD and 11.91% EER on the publicly available ASVspoof 2017 [2] eval set. These results show that a model trained on the LRPD dataset performs consistently under fully unknown conditions. Our dataset is free for research…
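The EER figures quoted above refer to the operating point at which the false acceptance rate (spoofed audio accepted) equals the false rejection rate (bona fide audio rejected). The following is a minimal sketch of how such a number can be computed from detector scores. It assumes that higher scores mean "more likely bona fide"; the function name compute_eer and the toy scores are illustrative and are not taken from the paper's pipeline.

```python
def compute_eer(bonafide_scores, spoof_scores):
    """Return (eer, threshold); a trial is accepted when score > threshold.

    Assumes higher scores indicate bona fide speech (an assumption of this
    sketch, not something stated in the paper).
    """
    if not bonafide_scores or not spoof_scores:
        raise ValueError("need at least one score of each class")

    # Pair every score with its class and sort by score, ascending.
    labeled = sorted(
        [(s, True) for s in bonafide_scores] + [(s, False) for s in spoof_scores],
        key=lambda pair: pair[0],
    )
    n_bona, n_spoof = len(bonafide_scores), len(spoof_scores)

    # Start with a threshold below every score: everything is accepted,
    # so FAR = 1 and FRR = 0.
    rejected_bona = rejected_spoof = 0
    best_gap, eer, eer_threshold = 1.0, 0.5, float("-inf")

    i = 0
    while i < len(labeled):
        score = labeled[i][0]
        # Raise the threshold to this score: all trials tied at it are rejected.
        while i < len(labeled) and labeled[i][0] == score:
            if labeled[i][1]:
                rejected_bona += 1
            else:
                rejected_spoof += 1
            i += 1
        frr = rejected_bona / n_bona                 # bona fide rejected
        far = (n_spoof - rejected_spoof) / n_spoof   # spoof accepted
        gap = abs(far - frr)
        if gap < best_gap:
            # Closest point to FAR == FRR so far; report their average.
            best_gap, eer, eer_threshold = gap, (far + frr) / 2, score
    return eer, eer_threshold


# Toy scores (hypothetical): one spoofed trial scores above one bona fide trial.
eer, threshold = compute_eer([0.9, 0.8, 0.4, 0.7], [0.1, 0.2, 0.6, 0.3])
print(f"EER = {eer:.2%}")  # prints "EER = 25.00%"
```

With a finite number of trials the two error rates rarely cross exactly, so this sketch reports the average of FAR and FRR at the threshold where they are closest. Evaluation toolkits often interpolate between thresholds instead, which can give slightly different values.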
