Replay Spoofing Countermeasure Using Autoencoder and Siamese Network on ASVspoof 2019 Challenge

Mohammad Adiban; Hossein Sameti; Saeedreza Shehnepoor

arXiv:1910.13345 · eess.AS · October 30, 2019 · 1 cites

Open Access

TL;DR

This paper proposes a novel replay spoofing countermeasure for automatic speaker verification that combines CQCC features, an autoencoder, and a Siamese network, and reports improvements over the baseline on the ASVspoof 2019 dataset.

Contribution

Introduces a new replay spoofing detection method that combines CQCC features, an autoencoder, and a Siamese network; this is described as the first application of Siamese networks to replay spoofing detection.
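
To make the role of the Siamese network more concrete, the sketch below shows one common way such a network is trained: a contrastive loss over pairs of embeddings. This is an illustration only. The page does not state which loss or distance the authors used, so the contrastive loss, the Euclidean distance, the `margin` value, and the function names are all assumptions, and the embeddings are stand-ins for whatever the shared encoder would produce from autoencoder-processed CQCC features.

```python
import math


def euclidean_distance(emb_a, emb_b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(emb_a, emb_b)))


def contrastive_loss(emb_a, emb_b, same_class, margin=1.0):
    """Contrastive loss for one pair of embeddings (assumed, not from the paper).

    same_class=True  -> both utterances are bona fide, or both are replayed;
                        the loss pulls their embeddings together.
    same_class=False -> one bona fide and one replayed utterance; the loss
                        pushes them apart until they are at least `margin` apart.
    """
    d = euclidean_distance(emb_a, emb_b)
    if same_class:
        return d ** 2
    return max(0.0, margin - d) ** 2


# Hypothetical 2-D embeddings, chosen only to exercise both branches.
genuine_1 = [0.0, 0.0]
genuine_2 = [0.1, 0.0]
replayed = [3.0, 4.0]

loss_same = contrastive_loss(genuine_1, genuine_2, same_class=True)
loss_diff = contrastive_loss(genuine_1, replayed, same_class=False)
```

In this toy setup the dissimilar pair is already farther apart than the margin, so it contributes no loss, while the similar pair is penalised by its squared distance.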

Findings

01. Achieved 10.73% reduction in EER over baseline.

02. Improved t-DCF by 0.2344 compared to baseline.

03. Effective discrimination of replay spoofing attacks.
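
The first finding is reported in terms of the equal error rate (EER), the operating point at which the false acceptance rate and the false rejection rate are equal. The following is a minimal sketch of how EER can be estimated from countermeasure scores, assuming that a higher score means "more likely bona fide". The function name `compute_eer` and the toy scores are hypothetical, and this is not the official ASVspoof scoring code; the t-DCF metric is not covered here.

```python
def compute_eer(bonafide_scores, spoof_scores):
    """Estimate the equal error rate by sweeping over observed scores.

    Assumes higher score = more likely bona fide. Returns (eer, threshold),
    where the threshold is the candidate at which the false acceptance rate
    (FAR) and false rejection rate (FRR) are closest, and eer is their mean.
    """
    thresholds = sorted(set(bonafide_scores) | set(spoof_scores))
    best = None
    for t in thresholds:
        # FRR: fraction of bona fide trials rejected (score below threshold).
        frr = sum(s < t for s in bonafide_scores) / len(bonafide_scores)
        # FAR: fraction of spoofed trials accepted (score at or above threshold).
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Toy scores, invented for illustration only.
bonafide = [0.9, 0.8, 0.7, 0.4]
spoof = [0.6, 0.3, 0.2, 0.1]
eer, threshold = compute_eer(bonafide, spoof)
```

With a finite set of scores, FAR and FRR rarely cross exactly, so the sketch reports the midpoint at the closest candidate threshold; production scoring tools usually interpolate instead.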

Abstract

Automatic Speaker Verification (ASV) is the process of identifying a person based on the voice presented to a system. Various spoofing approaches can deceive ASV systems (ASVs), whether by imitating a voice or by reconstructing its features. Attackers try to defeat ASVs using four general techniques: impersonation, speech synthesis, voice conversion, and replay. The last is considered a common and high-potential spoofing tool, since replay attacks are more accessible and require no technical knowledge from adversaries. In this study, we introduce a novel replay spoofing countermeasure for ASVs. We use Constant Q Cepstral Coefficient (CQCC) features fed into an autoencoder to obtain more informative features and to take the noise information of spoofed utterances into account for discrimination. Finally, different…

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing