The PartialSpoof Database and Countermeasures for the Detection of Short Fake Speech Segments Embedded in an Utterance
Lin Zhang, Xin Wang, Erica Cooper, Nicholas Evans, Junichi Yamagishi

TL;DR
This paper introduces the PartialSpoof database and a new countermeasure that detects and localizes short fake speech segments embedded within genuine utterances, enhancing spoof detection accuracy at finer temporal resolutions.
Contribution
It extends the PartialSpoof database with segment labels at multiple temporal resolutions and proposes a countermeasure capable of simultaneous utterance- and segment-level spoof detection.
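To make "segment labels at multiple temporal resolutions" concrete, the following is a minimal sketch of one way such labels could be derived from a single fine-grained label sequence. It is not the authors' labeling procedure. The function names, the 0/1 encoding (1 = spoof, 0 = bona fide), the merge factors, and the rule that a coarser segment counts as spoofed if any part of it is spoofed are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence


def downsample_labels(fine_labels: Sequence[int], factor: int) -> List[int]:
    """Merge every `factor` consecutive fine labels into one coarse label.

    Assumed encoding: 1 = spoof, 0 = bona fide.
    Assumed rule: a coarse segment is spoofed if any fine label in it is spoofed.
    A trailing partial window is kept and labeled by the same rule.
    """
    if factor < 1:
        raise ValueError("factor must be >= 1")
    coarse = []
    for start in range(0, len(fine_labels), factor):
        window = fine_labels[start:start + factor]
        coarse.append(int(any(window)))
    return coarse


def multi_resolution_labels(
    fine_labels: Sequence[int], factors: Sequence[int] = (1, 2, 4)
) -> Dict[int, List[int]]:
    """Return one label sequence per merge factor (hypothetical factors)."""
    return {f: downsample_labels(fine_labels, f) for f in factors}


def utterance_label(fine_labels: Sequence[int]) -> int:
    """Utterance-level label under the same assumed rule: spoof if any segment is."""
    return int(any(fine_labels))


if __name__ == "__main__":
    # One short spoofed segment embedded in an otherwise bona fide utterance.
    fine = [0, 0, 1, 0, 0, 0, 0, 0]
    for factor, labels in multi_resolution_labels(fine).items():
        print(factor, labels)
    print("utterance:", utterance_label(fine))
```

Under these assumptions, the single spoofed fine segment stays visible at every coarser resolution and at the utterance level, which is what lets one label sequence supervise both segment- and utterance-level detection.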
Findings
Achieved low error rates in detecting spoofed segments at various temporal resolutions.
Demonstrated effectiveness of self-supervised models as feature extractors.
Outperformed existing methods in both PartialSpoof and ASVspoof 2019 LA scenarios.
Abstract
Automatic speaker verification is susceptible to various manipulations and spoofing attacks, such as text-to-speech synthesis, voice conversion, replay, tampering, and adversarial attacks. We consider a new spoofing scenario called "Partial Spoof" (PS) in which synthesized or transformed speech segments are embedded into a bona fide utterance. While existing countermeasures (CMs) can detect fully spoofed utterances, they need to be adapted or extended to the PS scenario. We propose various improvements to construct a significantly more accurate CM that can detect and locate short generated spoofed speech segments at finer temporal resolutions. First, we introduce newly developed self-supervised pre-trained models as enhanced feature extractors. Second, we extend our PartialSpoof database by adding segment labels for various temporal resolutions. Since the short spoofed…
