Improved DeepFake Detection Using Whisper Features

Piotr Kawa; Marcin Plata; Michał Czuba; Piotr Szymański; Piotr Syga

arXiv:2306.01428 · cs.SD · June 5, 2023 · 2 cites

Open Access · 1 Repo

TL;DR

This paper explores using Whisper speech recognition features as a front-end for DeepFake audio detection, demonstrating improved accuracy and reduced error rates across multiple detection models and datasets.

Contribution

It introduces Whisper features as a novel front-end for DeepFake detection and outperforms recent results on the In-The-Wild dataset.

Findings

01. Whisper features improve detection accuracy for all tested models.

02. Using Whisper reduces the Equal Error Rate (EER) by 21% on the In-The-Wild dataset.

03. Whisper-based front-ends outperform traditional features in DeepFake audio detection.
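
The Equal Error Rate named in the findings is the operating point at which the false acceptance rate equals the false rejection rate. The following is a minimal NumPy sketch of that metric, not the evaluation code from the paper's repository. It assumes that higher scores mean "bona fide" and that labels use 1 for bona fide and 0 for spoofed audio; the function name `equal_error_rate` is illustrative.

```python
import numpy as np


def equal_error_rate(scores, labels):
    """Approximate the EER by scanning every observed score as a threshold.

    Assumptions (not from the paper): higher score = more likely bona fide,
    labels are 1 for bona fide and 0 for spoof, both classes are present.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)

    bona_fide = scores[labels == 1]
    spoof = scores[labels == 0]

    # Candidate thresholds: each distinct score, plus one above the maximum
    # so that the "reject everything" operating point is also considered.
    thresholds = np.unique(scores)
    thresholds = np.concatenate([thresholds, [thresholds[-1] + 1.0]])

    # A sample is accepted as bona fide when score >= threshold.
    far = np.array([(spoof >= t).mean() for t in thresholds])      # spoof accepted
    frr = np.array([(bona_fide < t).mean() for t in thresholds])   # bona fide rejected

    # The EER sits where the two error curves cross; take the closest point.
    idx = np.argmin(np.abs(far - frr))
    return (far[idx] + frr[idx]) / 2.0


# Toy example: one bona fide sample scores below one spoofed sample.
print(equal_error_rate([0.9, 0.4, 0.6, 0.1], [1, 1, 0, 0]))  # 0.5
```

A lower EER means the detector separates the two classes better; a detector that separates them perfectly reaches 0.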

Abstract

With a recent influx of voice generation methods, the threat introduced by audio DeepFake (DF) is ever-increasing. Several different detection methods have been presented as a countermeasure. Many methods are based on so-called front-ends, which, by transforming the raw audio, emphasize features crucial for assessing the genuineness of the audio sample. Our contribution is an investigation of the influence of the state-of-the-art Whisper automatic speech recognition model as a DF detection front-end. We compare various combinations of Whisper and well-established front-ends by training 3 detection models (LCNN, SpecRNet, and MesoNet) on the widely used ASVspoof 2021 DF dataset and later evaluating them on the DF In-The-Wild dataset. We show that using Whisper-based features improves the detection for each model and outperforms recent results on the In-The-Wild dataset by reducing Equal…
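
The abstract mentions combining Whisper with well-established front-ends but does not say how the combination is done. One plausible reading, shown here only as a hedged sketch, is to concatenate the two feature maps along the feature axis after trimming them to a common number of frames. The function name `concat_front_ends`, the `(features, frames)` layout, and the array shapes are all assumptions for illustration; random arrays stand in for real Whisper encoder outputs and cepstral features.

```python
import numpy as np


def concat_front_ends(whisper_feats, classic_feats):
    """Concatenate two (features, frames) maps along the feature axis.

    Hypothetical helper: the frame counts of the two front-ends may differ
    slightly, so both are trimmed to the shorter one before stacking.
    """
    n_frames = min(whisper_feats.shape[1], classic_feats.shape[1])
    return np.concatenate(
        [whisper_feats[:, :n_frames], classic_feats[:, :n_frames]],
        axis=0,
    )


# Placeholder inputs with illustrative shapes, not values from the paper.
rng = np.random.default_rng(0)
whisper_like = rng.standard_normal((384, 1500))   # stand-in for encoder features
cepstral_like = rng.standard_normal((128, 1498))  # stand-in for a classic front-end

combined = concat_front_ends(whisper_like, cepstral_like)
print(combined.shape)  # (512, 1498)
```

The combined map would then be passed to a detection model such as the LCNN, SpecRNet, or MesoNet named in the abstract; the actual fusion scheme should be checked against the linked repository.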

Code & Models

Repositories

piotrkawa/deepfake-whisper-features
PyTorch · Official

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing