SceneFake: An Initial Dataset and Benchmarks for Scene Fake Audio Detection

Jiangyan Yi; Chenglong Wang; Jianhua Tao; Chu Yuan Zhang; Cunhang Fan; Zhengkun Tian; Haoxin Ma; Ruibo Fu

arXiv:2211.06073·cs.SD·April 5, 2024

TL;DR

This paper introduces SceneFake, a new dataset for detecting fake audio in which the acoustic scene has been manipulated. It reveals the limitations of current detection models and emphasizes the need for specialized detection methods.

Contribution

The paper presents SceneFake, the first dataset focused on scene fake audio detection, along with benchmark results and analysis of attack methods using speech enhancement technologies.

Findings

01. Baseline models perform poorly on unseen test data.

02. Models trained on ASVspoof 2019 do not generalize well to scene fake audio.

03. The SceneFake dataset exposes gaps in current fake audio detection capabilities.

Abstract

Many datasets have been designed to further the development of fake audio detection. However, fake utterances in previous datasets are mostly generated by altering the timbre, prosody, linguistic content or channel noise of the original audio. These datasets leave out a scenario in which the acoustic scene of an original audio is replaced with a forged one. Such manipulated audio would pose a major threat to society if misused for malicious purposes. This motivates us to fill the gap. This paper proposes a dataset for scene fake audio detection named SceneFake, in which manipulated audio is generated by tampering only with the acoustic scene of a real utterance using speech enhancement technologies. Some scene fake audio detection benchmark results on the SceneFake dataset are reported in this paper. In addition, an analysis of fake attacks with…
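The abstract describes scene fake audio as a real utterance whose acoustic scene has been swapped using speech enhancement. As a rough illustration only, the sketch below assumes the enhancement step has already produced a cleaned speech signal, and shows one plausible final step: overlaying a different scene recording at a chosen signal-to-noise ratio. The function names (`rms`, `mix_at_snr`), the SNR-based mixing, and the toy signals are assumptions made for illustration, not details taken from the paper.

```python
import math
import random


def rms(signal):
    """Root-mean-square level of a list of float samples."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def mix_at_snr(speech, scene_noise, snr_db):
    """Overlay scene_noise on speech so the speech-to-noise ratio is snr_db.

    Hypothetical helper: the paper's actual generation pipeline is not
    specified in the abstract.
    """
    if len(speech) != len(scene_noise):
        raise ValueError("speech and scene_noise must have the same length")
    noise_rms = rms(scene_noise)
    if noise_rms == 0.0:
        raise ValueError("scene_noise is silent")
    # From SNR_dB = 20 * log10(rms_speech / rms_noise), solve for the noise level.
    target_noise_rms = rms(speech) / (10 ** (snr_db / 20))
    gain = target_noise_rms / noise_rms
    return [s + gain * n for s, n in zip(speech, scene_noise)]


# Toy stand-ins (assumptions): a sine wave plays the role of enhanced speech,
# and Gaussian noise plays the role of a recording of the new acoustic scene.
random.seed(0)
enhanced_speech = [math.sin(2 * math.pi * 220 * t / 16000) for t in range(16000)]
new_scene = [random.gauss(0.0, 1.0) for _ in range(16000)]
scene_fake = mix_at_snr(enhanced_speech, new_scene, snr_db=10.0)
```

In this sketch the speech content is left untouched and only the background changes, which is the property that distinguishes the scenario in the abstract from fakes that alter timbre, prosody, or linguistic content.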

Taxonomy

Topics: Music and Audio Processing · Digital Media Forensic Detection · Music Technology and Sound Studies