Synthetic Voice Spoofing Detection Based On Online Hard Example Mining
Chenlei Hu, Ruohua Zhou

TL;DR
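This paper introduces an Online Hard Example Mining (OHEM) algorithm to improve detection of unknown voice spoofing attacks, achieving a 0.77% equal error rate on the ASVspoof 2019 logical access evaluation set and addressing the imbalance between simple and hard samples in the dataset.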
Contribution
It presents a novel OHEM-based system for voice spoofing detection that enhances recognition of hard-to-detect attacks in speaker verification.
Findings
Achieved 0.77% EER on the ASVspoof 2019 logical access evaluation set
Effectively handles dataset imbalance between simple and hard samples
Improves detection of unknown voice spoofing attacks
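For readers unfamiliar with the metric, the equal error rate is the operating point at which the false acceptance rate (spoofed audio accepted) equals the false rejection rate (bona fide audio rejected). The following is a minimal sketch of how an EER can be estimated from detector scores. It is not the official ASVspoof scoring tool; the function name `equal_error_rate` and the convention that higher scores mean "more likely bona fide" are assumptions made for illustration.

```python
def equal_error_rate(bonafide_scores, spoof_scores):
    """Estimate the EER by sweeping a threshold over all observed scores.

    Assumption: a higher score means the detector considers the
    utterance more likely to be bona fide.
    """
    thresholds = sorted(set(bonafide_scores) | set(spoof_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        # False rejection: bona fide utterances scored below the threshold.
        frr = sum(s < t for s in bonafide_scores) / len(bonafide_scores)
        # False acceptance: spoofed utterances scored at or above the threshold.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            # Keep the threshold where the two error rates are closest.
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Toy scores, invented purely for illustration.
bonafide = [0.9, 0.8, 0.7, 0.3]
spoof = [0.1, 0.2, 0.4, 0.6]
print(equal_error_rate(bonafide, spoof))
```

With a finite score set the two error curves rarely cross exactly, so this sketch reports the average of the two rates at the closest point; production scoring tools typically interpolate instead.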
Abstract
The automatic speaker verification spoofing (ASVspoof) challenge series is crucial for drawing attention to spoofing and advancing the development of countermeasures. Although the recent ASVspoof 2019 validation results indicate a strong ability to identify most attacks, recognition performance remains poor for some attacks. This paper presents a system based on the Online Hard Example Mining (OHEM) algorithm for detecting unknown voice spoofing attacks. OHEM is used to overcome the imbalance between simple and hard samples in the dataset. The presented system achieves an equal error rate (EER) of 0.77% on the evaluation set of the ASVspoof 2019 Challenge logical access scenario.
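The general idea behind online hard example mining can be sketched as follows: within each mini-batch, compute a loss for every sample, keep only the samples with the highest loss, and average over those, so that easy samples do not dominate the update. This is a minimal, framework-free sketch of that general technique, not the authors' implementation. The abstract does not state the selection rule, loss function, or model, so the `keep_ratio` parameter, the binary cross-entropy loss, and the label convention (1 = spoof, 0 = bona fide) are all assumptions.

```python
import math


def per_sample_bce(probs, labels, eps=1e-12):
    """Binary cross-entropy per sample (assumed labels: 1 = spoof, 0 = bona fide)."""
    losses = []
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        losses.append(-(y * math.log(p) + (1 - y) * math.log(1.0 - p)))
    return losses


def ohem_loss(probs, labels, keep_ratio=0.5):
    """Average the loss over the hardest fraction of the batch.

    keep_ratio is a hypothetical hyperparameter: the fraction of
    samples (ranked by loss, highest first) that contribute to the loss.
    Returns the mined loss and the sorted indices of the kept samples.
    """
    losses = per_sample_bce(probs, labels)
    k = max(1, math.ceil(keep_ratio * len(losses)))
    ranked = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
    hard_idx = ranked[:k]
    return sum(losses[i] for i in hard_idx) / k, sorted(hard_idx)


# Toy batch: predicted spoof probabilities and ground-truth labels.
probs = [0.9, 0.6, 0.2, 0.1]
labels = [1, 1, 0, 0]
loss, kept = ohem_loss(probs, labels, keep_ratio=0.5)
print(kept, loss)
```

In this toy batch the two least confidently classified samples are kept and the two easy ones are ignored. In a real training loop the same selection would be applied to the per-sample losses of a neural network before backpropagation.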
Taxonomy
Topics: Speech Recognition and Synthesis
