End-to-end spoofing detection with raw waveform CLDNNs

Heinrich Dinkel; Nanxin Chen; Yanmin Qian; Kai Yu

arXiv:2007.13060·eess.AS·July 28, 2020


TL;DR

This paper introduces a novel end-to-end deep learning model based on raw waveforms for spoofing detection in speaker verification, achieving state-of-the-art accuracy without pre- or post-processing.

Contribution

The paper presents a raw waveform CLDNN model that jointly extracts features and classifies speech signals, streamlining spoofing detection.

Findings

01

Reduces the half total error rate (HTER) to 0.82% on the BTAS2016 dataset

02

Performs well under unknown spoofing conditions

03

Outperforms previous methods with a simpler end-to-end approach
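
For readers unfamiliar with the metric in the first finding: the half total error rate is conventionally the mean of the false acceptance rate (spoofed speech accepted as genuine) and the false rejection rate (genuine speech rejected) at a fixed decision threshold. The following is a minimal sketch of that standard definition, not the paper's evaluation code. The function name, the score convention (higher means more likely genuine), and the example scores are all assumptions made for illustration.

```python
def half_total_error_rate(scores, labels, threshold):
    """Return HTER = (FAR + FRR) / 2 at a fixed threshold.

    Assumed convention (hypothetical, not taken from the paper):
    a higher score means "more likely genuine"; label 1 = genuine, 0 = spoofed.
    """
    genuine = [s for s, y in zip(scores, labels) if y == 1]
    spoofed = [s for s, y in zip(scores, labels) if y == 0]
    if not genuine or not spoofed:
        raise ValueError("need at least one genuine and one spoofed trial")

    # False acceptance: a spoofed trial scores at or above the threshold.
    far = sum(s >= threshold for s in spoofed) / len(spoofed)
    # False rejection: a genuine trial scores below the threshold.
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return (far + frr) / 2


# Made-up scores for illustration only.
example_scores = [0.9, 0.8, 0.4, 0.7, 0.2, 0.1]
example_labels = [1, 1, 1, 0, 0, 0]
example_hter = half_total_error_rate(example_scores, example_labels, threshold=0.5)
print(f"HTER = {example_hter:.4f}")
```

In practice the threshold is usually fixed on a development set and then applied unchanged to the test set, so the reported HTER reflects a decision rule chosen without access to the test data.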

Abstract

Although recent progress in speaker verification has produced powerful models, malicious attacks in the form of spoofed speech are generally not addressed. Recent results in the ASVSpoof2015 and BTAS2016 challenges indicate that spoof-aware features are a possible solution to this problem. The most successful methods in both challenges focus on spoof-aware features rather than on a powerful classifier. In this paper we present a novel raw-waveform-based deep model for spoofing detection, which acts jointly as a feature extractor and classifier and can therefore classify speech signals directly. This approach can be considered an end-to-end classifier: it removes the need for any pre- or post-processing of the data, which streamlines training and evaluation and consumes less time than other neural-network-based approaches. The experiments on the BTAS2016 dataset show…
