Beyond Silence: Bias Analysis through Loss and Asymmetric Approach in Audio Anti-Spoofing
Hye-jin Shim, Md Sahidullah, Jee-weon Jung, Shinji Watanabe, Tomi Kinnunen

TL;DR
This paper investigates biases in audio anti-spoofing models, revealing class-specific differences in training dynamics and emphasizing the importance of robust bonafide class modeling for better generalization.
Contribution
It introduces loss analysis and asymmetric methods to analyze model behavior beyond silence, highlighting class-wise differences and guiding future robust modeling efforts.
Findings
Training dynamics differ significantly between the bona fide and spoof classes.
Differences in the distribution of silence between the two classes can serve as a shortcut for detection.
Future work needs to focus on robust modeling of the bona fide class.
Abstract
Current trends in audio anti-spoofing detection research strive to improve models' ability to generalize across unseen attacks by learning to identify a variety of spoofing artifacts. This emphasis has primarily focused on the spoof class. Recently, several studies have noted that the distribution of silence differs between the two classes, which can serve as a shortcut. In this paper, we extend class-wise interpretations beyond silence. We employ loss analysis and asymmetric methodologies to move away from traditional attack-focused and result-oriented evaluations towards a deeper examination of model behaviors. Our investigations highlight the significant differences in training dynamics between the two classes, emphasizing the need for future research to focus on robust modeling of the bonafide class.
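The class-wise loss analysis mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `classwise_mean_loss`, the use of binary cross-entropy, and the toy scores are all assumptions made for illustration. The idea is simply to average the loss separately over bona fide and spoof utterances, so that the two classes can be compared (for example, once per training epoch) instead of being hidden in a single pooled loss value.

```python
import math
from collections import defaultdict


def classwise_mean_loss(probs, labels):
    """Return the mean binary cross-entropy for each class separately.

    probs  -- predicted probability that each utterance is bona fide
              (hypothetical model outputs, assumed for this sketch)
    labels -- 1 for bona fide, 0 for spoof
    """
    eps = 1e-12  # clamp probabilities to avoid log(0)
    totals = defaultdict(float)
    counts = defaultdict(int)
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)
        # Cross-entropy of one sample with respect to its true class.
        loss = -math.log(p) if y == 1 else -math.log(1.0 - p)
        name = "bonafide" if y == 1 else "spoof"
        totals[name] += loss
        counts[name] += 1
    return {name: totals[name] / counts[name] for name in totals}


# Toy example: two bona fide and two spoof utterances (made-up scores).
probs = [0.9, 0.6, 0.2, 0.1]
labels = [1, 1, 0, 0]
losses = classwise_mean_loss(probs, labels)
print(losses)
```

Logging these two values at each epoch gives one loss curve per class. In this toy example the bona fide class has the higher mean loss, which is the kind of asymmetry a pooled loss would not show.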
Taxonomy
Topics: Digital Media Forensic Detection · Music and Audio Processing
Methods: Focus
