ADD 2022: the First Audio Deep Synthesis Detection Challenge

Jiangyan Yi; Ruibo Fu; Jianhua Tao; Shuai Nie; Haoxin Ma; Chenglong Wang; Tao Wang; Zhengkun Tian; Xiaohui Zhang; Ye Bai; Cunhang Fan; Shan Liang; Shiming Wang; Shuai Zhang; Xinrui Yan; Le Xu; Zhengqi Wen; Haizhou Li; Zheng Lian; Bin Liu

arXiv:2202.08433·cs.SD·July 3, 2024·1 cites

TL;DR

The paper introduces the first Audio Deep Synthesis Detection Challenge (ADD 2022), addressing real-world scenarios of audio deepfake detection through three distinct tracks and providing insights into recent progress in the field.

Contribution

It presents a comprehensive challenge with datasets, evaluation protocols, and benchmarks for detecting various types of audio deepfakes, filling a gap in real-world scenario testing.

Findings

01 Advances in detection accuracy across tracks

02 Effectiveness of different detection methods

03 Insights into challenges of real-world audio deepfake detection

Abstract

Audio deepfake detection is an emerging topic that was included in ASVspoof 2021. However, recent shared tasks have not covered many challenging real-life scenarios. The first Audio Deep Synthesis Detection challenge (ADD) was motivated to fill this gap. ADD 2022 includes three tracks: low-quality fake audio detection (LF), partially fake audio detection (PF), and audio fake game (FG). The LF track focuses on distinguishing bona fide utterances from fully fake ones affected by various real-world noises and similar disturbances. The PF track aims to distinguish partially fake audio from real audio. The FG track is a rivalry game with two tasks: an audio generation task and an audio fake detection task. In this paper, we describe the datasets, evaluation metrics, and protocols. We also report major findings that reflect recent advances in audio deepfake detection.

Taxonomy

Topics: Music and Audio Processing · Digital Media Forensic Detection · Speech and Audio Processing