A Study On Data Augmentation In Voice Anti-Spoofing

Ariel Cohen, Inbal Rimon, Eran Aflalo, and Haim Permuter

arXiv:2110.10491·cs.SD·October 22, 2021

TL;DR

This paper studies how data augmentation improves the detection of synthetic or spoofed audio. It introduces new methods, including SpecAverage and a new spectrogram feature design, and reports state-of-the-art results on the ASVspoof 2021 challenge.

Contribution

The paper presents new data augmentation strategies (compression augmentation, channel augmentation, and SpecAverage) together with a new spectrogram feature design. The models are held fixed, so the reported improvements in anti-spoofing performance come from changes to the data alone.

Findings

01. State-of-the-art EER of 15.46% in Deep Fake detection

02. 50% reduction in baseline EER for Logical Access

03. Improved generalization through SpecAverage augmentation
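The findings above are reported as equal error rate (EER): the operating point at which the false acceptance rate and the false rejection rate are equal. As a rough illustration of what the number means, the sketch below sweeps a threshold over a set of detector scores and returns the error rate where the two rates are closest. The function name `equal_error_rate` and the score convention (higher means more likely bona fide) are assumptions made for this example; the challenge's official scoring tools may interpolate differently.

```python
def equal_error_rate(bonafide_scores, spoof_scores):
    """Approximate the EER (as a fraction) by a simple threshold sweep.

    Assumes higher scores mean "more likely bona fide". This is an
    illustrative sketch, not the official ASVspoof scoring code.
    """
    thresholds = sorted(set(bonafide_scores) | set(spoof_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        # False rejection: bona fide trials scored below the threshold.
        frr = sum(s < t for s in bonafide_scores) / len(bonafide_scores)
        # False acceptance: spoofed trials scored at or above the threshold.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            # Keep the point where the two error rates are closest.
            best_gap, eer = gap, (far + frr) / 2
    return eer


if __name__ == "__main__":
    # Toy scores, invented for illustration only.
    print(equal_error_rate([0.9, 0.8, 0.7, 0.4], [0.6, 0.3, 0.2, 0.1]))
```

With the toy scores shown, one bona fide trial and one spoofed trial fall on the wrong side of the best threshold, so the function returns 0.25, that is, a 25% EER.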

Abstract

In this paper, we perform an in-depth study of how data augmentation techniques improve synthetic or spoofed audio detection. Specifically, we propose methods to deal with channel variability, different audio compressions, different bandwidths, and unseen spoofing attacks, all of which have been shown to significantly degrade the performance of audio-based systems and anti-spoofing systems. Our results are based on the ASVspoof 2021 challenge, in the Logical Access (LA) and Deep Fake (DF) categories. Our study is data-centric, meaning that the models are fixed and we significantly improve the results by making changes in the data. We introduce two forms of data augmentation: compression augmentation for the DF part, and compression and channel augmentation for the LA part. In addition, a new type of online data augmentation, SpecAverage, is introduced in which the audio features are masked…
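The abstract describes SpecAverage as an online augmentation that masks the audio features, but the text here is cut off before it gives the details. The sketch below is therefore only a guess at the general idea, in the style of SpecAugment-like masking: it blanks one random frequency band and one random time span of a spectrogram and fills them with the mean of the input. The function name `mean_fill_mask`, the mean fill value, and the mask widths are all assumptions for illustration, not the authors' implementation; see the linked repository for the real one.

```python
import numpy as np


def mean_fill_mask(spec, max_freq_width=8, max_time_width=16, rng=None):
    """Mask one random frequency band and one random time span with the mean.

    `spec` is a 2-D array shaped (frequency_bins, time_frames). The fill
    value and the default widths are assumptions made for this sketch.
    """
    rng = np.random.default_rng() if rng is None else rng
    out = np.array(spec, dtype=float, copy=True)  # leave the input untouched
    fill = out.mean()  # assumed fill value: average of the input features

    n_freq, n_time = out.shape

    # Frequency mask: a band of random width at a random position.
    f_width = int(rng.integers(0, min(max_freq_width, n_freq) + 1))
    f_start = int(rng.integers(0, n_freq - f_width + 1))
    out[f_start:f_start + f_width, :] = fill

    # Time mask: a span of random width at a random position.
    t_width = int(rng.integers(0, min(max_time_width, n_time) + 1))
    t_start = int(rng.integers(0, n_time - t_width + 1))
    out[:, t_start:t_start + t_width] = fill
    return out


if __name__ == "__main__":
    # Random array standing in for a log-spectrogram (40 bins, 100 frames).
    spec = np.random.default_rng(0).normal(size=(40, 100))
    masked = mean_fill_mask(spec, rng=np.random.default_rng(1))
    print(masked.shape)
```

Because the masks are drawn fresh on every call, applying the function inside a data loader would give each training example a different mask in each epoch, which is what "online" augmentation usually means.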

Code & Models

Repositories

InbalRim/A-Study-On-Data-Augmentation-In-Voice-Anti-Spoofing
PyTorch · Official

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Voice and Speech Disorders