A Preliminary Case Study on Long-Form In-the-Wild Audio Spoofing Detection
Xuechen Liu, Xin Wang, Junichi Yamagishi

TL;DR
This paper investigates the performance of audio spoofing detection systems in realistic, complex scenarios involving long-duration, multi-speaker audio with acoustic variations, highlighting current limitations and potential improvements.
Contribution
It provides the first analysis of spoofing detection in in-the-wild conditions with complex audio, revealing key issues and suggesting preliminary enhancements.
Findings
Current methods struggle with long, multi-speaker audio
Variations in duration and acoustics significantly affect detection performance
Identifies key challenges for real-world audio spoofing detection
Abstract
Audio spoofing detection has become increasingly important due to the rise in real-world cases. Current spoofing detectors, referred to as spoofing countermeasures (CM), are mainly trained on and designed for short, single-speaker audio waveforms. This study explores spoofing detection in more realistic scenarios, where the audio is long in duration and features multiple speakers and complex acoustic conditions. We test the widely used AASIST under this challenging scenario, examining how variations in duration, speaker presence, and acoustic complexity affect CM performance. Our work reveals key issues with current methods and suggests preliminary ways to improve them. We aim to make spoofing detection more applicable to in-the-wild scenarios. This research serves as an important step towards developing detection systems that can handle…
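To illustrate the duration mismatch the abstract describes, the sketch below shows one common way a CM trained on short clips could be applied to a long recording: split the waveform into fixed-length windows, score each window, and aggregate the scores. This is an illustrative assumption, not the procedure used in the paper. The names `split_into_windows`, `score_long_audio`, and `score_fn` are hypothetical, and `score_fn` stands in for any detector (for example an AASIST model wrapped as a function) that maps one window to a bona fide score.

```python
from typing import Callable, List, Sequence, Tuple


def split_into_windows(samples: Sequence[float], window: int, hop: int) -> List[List[float]]:
    """Split a waveform into fixed-length windows; the final window is zero-padded."""
    if window <= 0 or hop <= 0:
        raise ValueError("window and hop must be positive")
    windows: List[List[float]] = []
    start = 0
    while True:
        chunk = list(samples[start:start + window])
        if not chunk:
            break
        if len(chunk) < window:
            # Pad the last partial window so every window has the same length.
            chunk += [0.0] * (window - len(chunk))
        windows.append(chunk)
        if start + window >= len(samples):
            break
        start += hop
    return windows


def score_long_audio(
    samples: Sequence[float],
    score_fn: Callable[[List[float]], float],  # hypothetical per-window CM scorer
    window: int,
    hop: int,
) -> Tuple[float, float]:
    """Return (mean, min) of per-window scores for a long recording."""
    scores = [score_fn(w) for w in split_into_windows(samples, window, hop)]
    if not scores:
        raise ValueError("empty audio")
    return sum(scores) / len(scores), min(scores)


if __name__ == "__main__":
    # Toy scorer for demonstration only: the mean sample value of the window.
    toy_score = lambda w: sum(w) / len(w)
    print(score_long_audio([1.0] * 10, toy_score, window=4, hop=4))
```

Under the assumption that higher scores mean bona fide speech, the mean summarizes the whole recording, while the minimum flags the single most suspicious window, which may matter when only part of a long recording is spoofed.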
Taxonomy
Topics: Speech and Audio Processing · Music and Audio Processing · Digital Media Forensic Detection
