Environmental Sound Deepfake Detection Challenge: An Overview

Han Yin; Yang Xiao; Rohan Kumar Das; Jisheng Bai; Ting Dang

arXiv:2512.24140·cs.SD·January 1, 2026

Open Access

TL;DR

This paper introduces EnvSDD, a large-scale dataset for environmental sound deepfake detection, and discusses the results of the associated challenge to advance detection methods against realistic audio forgeries.

Contribution

The paper presents EnvSDD, the first large-scale, diverse dataset for environmental sound deepfake detection, and launches the ESDD Challenge as a benchmark for future research.

Findings

01. Challenge results demonstrate improved detection accuracy

02. Diverse sound categories enhance model robustness

03. Baseline methods show potential for real-world application

Abstract

Recent progress in audio generation models has made it possible to create highly realistic and immersive soundscapes, which are now widely used in film and virtual-reality-related applications. However, these audio generators also raise concerns about potential misuse, such as producing deceptive audio for fabricated videos or spreading misleading information. Therefore, it is essential to develop effective methods for detecting fake environmental sounds. Existing datasets for environmental sound deepfake detection (ESDD) remain limited in both scale and the diversity of sound categories they cover. To address this gap, we introduced EnvSDD, the first large-scale curated dataset designed for ESDD. Based on EnvSDD, we launched the ESDD Challenge, recognized as one of the ICASSP 2026 Grand Challenges. This paper presents an overview of the ESDD Challenge, including a detailed analysis of…

Taxonomy

Topics: Music and Audio Processing · Generative Adversarial Networks and Image Synthesis · Digital Media Forensic Detection