TL;DR
This paper assesses the robustness of state-of-the-art audio spoof detection methods against laundering attacks, revealing significant performance degradation and highlighting the need for more resilient detection techniques.
Contribution
It introduces a new laundering attack database and evaluates existing SOTA spoof detection approaches, exposing their vulnerabilities to laundering attacks.
Findings
SOTA systems perform poorly against laundering attacks
Reverberation and noise attacks significantly degrade detection accuracy
More robust spoof detection methods are needed
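To make the two attack families named above concrete, here is a minimal sketch of how noise and reverberation laundering could be applied to a waveform. It is an illustration under stated assumptions, not the paper's pipeline: this summary does not give the noise types, SNR levels, or reverberation times used, so white Gaussian noise and a synthetic exponentially decaying impulse response stand in for them. The function names `add_noise_at_snr` and `add_synthetic_reverb` are hypothetical.

```python
import numpy as np


def add_noise_at_snr(signal, snr_db, rng):
    """Add white Gaussian noise scaled to a target SNR in dB.

    Assumption: white noise is a stand-in; the paper's noise types are not
    specified in this summary.
    """
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise


def add_synthetic_reverb(signal, sample_rate, rt60_s, rng):
    """Convolve with a crude synthetic room impulse response.

    Assumption: the impulse response is noise shaped by an envelope that
    decays by 60 dB after rt60_s seconds; a measured or simulated room
    response would be used in practice.
    """
    n = int(rt60_s * sample_rate)
    t = np.arange(n) / sample_rate
    envelope = 10.0 ** (-3.0 * t / rt60_s)  # -60 dB in amplitude at t = rt60_s
    rir = rng.standard_normal(n) * envelope
    rir[0] = 1.0  # direct path
    rir /= np.sqrt(np.sum(rir ** 2))  # normalize to unit energy
    # Truncate the reverberant tail so the output matches the input length.
    return np.convolve(signal, rir)[: len(signal)]


# Hypothetical usage on a 1-second, 440 Hz test tone sampled at 16 kHz.
rng = np.random.default_rng(0)
sample_rate = 16000
clean = np.sin(2 * np.pi * 440 * np.arange(sample_rate) / sample_rate)
noisy = add_noise_at_snr(clean, snr_db=10.0, rng=rng)
reverberant = add_synthetic_reverb(clean, sample_rate, rt60_s=0.3, rng=rng)
```

A detector would then be scored on `noisy` and `reverberant` in place of `clean`, which is the kind of comparison the findings above describe.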
Abstract
Voice-cloning (VC) systems have seen an exceptional increase in the realism of synthesized speech in recent years. The high quality of synthesized speech and the availability of low-cost VC services have given rise to many potential abuses of this technology. Several detection methodologies have been proposed over the years that can detect voice spoofs with reasonably good accuracy. However, these methodologies are mostly evaluated on clean audio databases, such as ASVSpoof 2019. This paper evaluates state-of-the-art (SOTA) audio spoof detection approaches in the presence of laundering attacks. To that end, a new laundering attack database, called the ASVSpoof Laundering Database, is created. It is derived from the ASVSpoof 2019 (LA) eval database and comprises a total of 1388.22 hours of audio recordings. Seven SOTA audio spoof detection approaches are evaluated on this laundered database. The…
