Source Tracing of Audio Deepfake Systems

Nicholas Klein; Tianxiang Chen; Hemlata Tak; Ricardo Casal; Elie Khoury

arXiv:2407.08016·eess.AS·September 26, 2024·1 cites

Open Access

TL;DR

This paper presents a system that classifies specific techniques used in audio deepfake creation, helping to identify the distinct stages and methods involved in generating realistic fake audio samples.

Contribution

It introduces a novel classification system that detects various spoofing attributes across the entire audio deepfake generation pipeline, enhancing anti-spoofing capabilities.

Findings

01. System accurately classifies spoofing attributes on two datasets: ASVspoof 2019 Logical Access and MLAAD

02. Demonstrates robustness against different deepfake generation techniques

03. Improves understanding of deepfake generation stages

Abstract

Recent progress in generative AI technology has made audio deepfakes remarkably more realistic. While current research on anti-spoofing systems primarily focuses on assessing whether a given audio sample is fake or genuine, there has been limited attention on discerning the specific techniques used to create the audio deepfakes. Algorithms commonly used in audio deepfake generation, like text-to-speech (TTS) and voice conversion (VC), undergo distinct stages including input processing, acoustic modeling, and waveform generation. In this work, we introduce a system designed to classify various spoofing attributes, capturing the distinctive features of individual modules throughout the entire generation pipeline. We evaluate our system on two datasets: the ASVspoof 2019 Logical Access and the Multi-Language Audio Anti-Spoofing Dataset (MLAAD). Results from both experiments demonstrate the…
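
As a rough illustration of the idea in the abstract (one prediction per generation stage rather than a single fake/genuine decision), the Python sketch below applies a separate softmax head to a shared embedding for each of the three stages. It is not the authors' implementation. The label names, the two-dimensional toy embedding, the hand-picked weights, and the names `STAGES`, `classify_attributes`, and `toy_heads` are all assumptions made for this example.

```python
import math

# Hypothetical label sets, one per pipeline stage named in the abstract.
# "text" and "speech" loosely correspond to TTS and VC inputs; the other
# labels are placeholders, not attribute names from the paper.
STAGES = {
    "input_processing": ["text", "speech"],
    "acoustic_model": ["acoustic_A", "acoustic_B", "acoustic_C"],
    "waveform_generation": ["vocoder_A", "vocoder_B"],
}

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def linear(weights, bias, x):
    """Compute W @ x + b for a weight matrix given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def classify_attributes(embedding, heads):
    """Return {stage: (predicted_label, probabilities)} for every stage."""
    result = {}
    for stage, labels in STAGES.items():
        weights, bias = heads[stage]
        probs = softmax(linear(weights, bias, embedding))
        best = max(range(len(labels)), key=probs.__getitem__)
        result[stage] = (labels[best], probs)
    return result

# Toy weights for illustration only; a real system would learn them.
toy_heads = {
    "input_processing": ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),
    "acoustic_model": ([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]], [0.0, 0.0, 0.0]),
    "waveform_generation": ([[0.5, 0.5], [-0.5, -0.5]], [0.0, 0.0]),
}

prediction = classify_attributes([2.0, -1.0], toy_heads)
for stage, (label, probs) in prediction.items():
    print(stage, label, [round(p, 3) for p in probs])
```

Each head yields its own probability distribution, so a sample can be attributed at one stage (for example, the waveform generator) even when another stage's prediction is uncertain.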

Taxonomy

Topics: Music and Audio Processing · Digital Media Forensic Detection · Speech Recognition and Synthesis

Methods: Softmax · Attention Is All You Need