The Impact of Silence on Speech Anti-Spoofing

Yuxiang Zhang; Zhuo Li; Jingze Lu; Hua Hua; Wenchao Wang; Pengyuan Zhang

arXiv:2309.11827·eess.AS·September 22, 2023

Open Access

TL;DR

This paper investigates how silence affects speech anti-spoofing countermeasures. It shows that both the proportion and the content of silence influence detection accuracy, and it proposes masking silence to improve robustness against unknown spoofing attacks.

Contribution

The study analyzes the impact of silence on anti-spoofing models, visualizes attention distribution, and proposes masking silence to enhance robustness against unknown spoofing attacks.
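
The summary above does not say how the masking is carried out (waveform level, feature level, or inside the model). As an illustration only, the sketch below masks silence at the waveform level: frames whose energy falls below a threshold are overwritten with a constant, so the utterance keeps its length but the silent segments no longer carry generator-specific content. The function name `mask_silence`, the energy-based silence detector, and the values of `frame_len` and `threshold_db` are all assumptions, not the authors' method.

```python
import math


def mask_silence(samples, frame_len=160, threshold_db=-40.0, fill=0.0):
    """Overwrite low-energy frames with `fill`, keeping the signal length.

    Assumption: silence is detected with a simple per-frame energy
    threshold; the paper's actual detector and masking scheme may differ.
    """
    masked = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        power = sum(x * x for x in frame) / len(frame)
        level_db = 10.0 * math.log10(power + 1e-12)  # epsilon avoids log(0)
        if level_db < threshold_db:
            masked.extend([fill] * len(frame))  # silent frame: mask it
        else:
            masked.extend(frame)  # active frame: keep unchanged
    return masked


# Toy utterance: low-level noise (about -60 dB) followed by a loud segment.
noise = [0.001 * (-1) ** i for i in range(320)]
speech = [0.5 * (-1) ** i for i in range(320)]
masked = mask_silence(noise + speech)
```

Masking, unlike removal, leaves the timing of the utterance intact, so the proportion of silence remains visible to the model while the content of silence does not.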

Findings

01

The proportion of silence duration is lower in TTS-generated spoof speech than in bonafide speech.

02

Even after retraining, removing silence significantly raises error rates on spoof speech generated by neural end-to-end TTS algorithms.

03

Masking silence improves model robustness against certain spoofing attacks.

Abstract

Current speech anti-spoofing countermeasures (CMs) show excellent performance on specific datasets. However, removing the silence from test speech through Voice Activity Detection (VAD) can severely degrade performance. In this paper, the impact of silence on speech anti-spoofing is analyzed. First, the reasons for the impact are explored, including the proportion of silence duration and the content of silence. The proportion of silence duration in spoof speech generated by text-to-speech (TTS) algorithms is lower than that in bonafide speech, and the content of silence produced by different waveform generators differs from that of bonafide speech. Then the impact of silence on model prediction is explored. Even after retraining, spoof speech generated by neural-network-based end-to-end TTS algorithms suffers a significant rise in error rates when the silence is removed. To…
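
To make the two quantities in the abstract concrete, here is a minimal sketch of measuring the proportion of silence in an utterance and of removing silence with a VAD. The abstract does not name the VAD used, so a basic energy-threshold detector stands in for it; the helper names and the values of `frame_len` and `threshold_db` are hypothetical.

```python
import math


def frame_is_silent(frame, threshold_db=-40.0):
    """Flag a frame as silence when its mean power is below threshold_db."""
    power = sum(x * x for x in frame) / len(frame)
    return 10.0 * math.log10(power + 1e-12) < threshold_db


def split_frames(samples, frame_len=160):
    """Cut a mono signal into consecutive, non-overlapping frames."""
    return [samples[i:i + frame_len] for i in range(0, len(samples), frame_len)]


def silence_proportion(samples, frame_len=160, threshold_db=-40.0):
    """Fraction of frames flagged as silence (0.0 for an empty signal)."""
    frames = split_frames(samples, frame_len)
    if not frames:
        return 0.0
    silent = sum(frame_is_silent(f, threshold_db) for f in frames)
    return silent / len(frames)


def remove_silence(samples, frame_len=160, threshold_db=-40.0):
    """Concatenate only the frames that the energy VAD marks as active."""
    kept = []
    for frame in split_frames(samples, frame_len):
        if not frame_is_silent(frame, threshold_db):
            kept.extend(frame)
    return kept


# Toy utterance: 1600 samples of digital silence, then 1600 loud samples.
speech = [0.5 if i % 2 == 0 else -0.5 for i in range(1600)]
utterance = [0.0] * 1600 + speech
print(silence_proportion(utterance))    # 0.5
print(len(remove_silence(utterance)))   # 1600
```

Comparing `silence_proportion` across bonafide and spoof utterances is one way to check the duration gap the abstract reports, and feeding `remove_silence` output to a countermeasure reproduces the test condition under which performance is said to degrade.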

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Voice and Speech Disorders