Dynamically Mitigating Data Discrepancy with Balanced Focal Loss for Replay Attack Detection

Yongqiang Dou; Haocheng Yang; Maolin Yang; Yanyan Xu; Dengfeng Ke

arXiv:2006.14563 · cs.CV · January 19, 2023 · 1 cites

TL;DR

This paper introduces D3M, a novel training approach using balanced focal loss to improve anti-spoofing in speaker verification, especially for indistinguishable samples, achieving state-of-the-art results on the ASVspoof2019 dataset.

Contribution

It proposes a balanced focal loss function for training anti-spoofing models, addressing data discrepancy issues and enhancing detection of challenging samples.
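
As a rough illustration of how a balanced focal loss rescales the per-sample loss, the sketch below uses the standard alpha-balanced focal loss form, FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t), for a two-class bonafide/spoof decision. The function names and the values alpha=0.5 and gamma=2.0 are illustrative assumptions, not settings taken from the paper or its repository.

```python
import math


def cross_entropy(p_bonafide, is_bonafide, eps=1e-12):
    """Binary cross-entropy for one utterance (baseline for comparison)."""
    # p_t is the probability the model assigns to the true class.
    p_t = p_bonafide if is_bonafide else 1.0 - p_bonafide
    p_t = min(max(p_t, eps), 1.0)
    return -math.log(p_t)


def balanced_focal_loss(p_bonafide, is_bonafide, alpha=0.5, gamma=2.0, eps=1e-12):
    """Alpha-balanced focal loss for one utterance.

    alpha and gamma are hypothetical defaults chosen for illustration only.
    """
    p_t = p_bonafide if is_bonafide else 1.0 - p_bonafide
    p_t = min(max(p_t, eps), 1.0)
    # alpha_t weights the two classes; (1 - p_t)^gamma shrinks the loss of
    # samples the model already classifies confidently.
    alpha_t = alpha if is_bonafide else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    # An easily classified bonafide sample versus a hard-to-distinguish one.
    for name, p in [("easy", 0.95), ("hard", 0.55)]:
        ce = cross_entropy(p, True)
        fl = balanced_focal_loss(p, True)
        print(f"{name}: cross-entropy={ce:.4f}, balanced focal={fl:.6f}")
```

Under these assumed settings, the modulating factor shrinks the loss of the confidently classified sample far more than that of the ambiguous one, so hard samples account for a larger share of the training signal than they would under cross-entropy.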

Findings

01 Balanced focal loss outperforms cross-entropy loss in anti-spoofing tasks.

02 Fusion of three feature types surpasses more complex models in performance.

03 Method maintains effectiveness on real replay data, indicating robustness.

Abstract

With the advancement of high-quality playback devices, it has become urgent to design effective anti-spoofing algorithms for vulnerable automatic speaker verification systems. Current studies mainly treat anti-spoofing as a binary classification problem between bonafide and spoofed utterances, while the lack of indistinguishable samples makes it difficult to train a robust spoofing detector. In this paper, we argue that anti-spoofing models should pay more attention to indistinguishable samples than to easily classified ones during modeling, so that correct discrimination becomes the top priority. Therefore, to mitigate the data discrepancy between training and inference, we propose D3M, which leverages a balanced focal loss function as the training objective to dynamically scale the loss based on the traits of the sample itself. Besides, in the experiments, we select three kinds of features that contain…

Code & Models

Repositories

asvspoof/D3M (PyTorch, official)

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing

Methods: Focal Loss