Speech is Silver, Silence is Golden: What do ASVspoof-trained Models Really Learn?

Nicolas M. Müller; Franziska Dieckmann; Pavel Czempin; Roman Canals; Konstantin Böttinger; Jennifer Williams

arXiv:2106.12914·cs.SD·September 29, 2021

TL;DR

This paper analyzes an artifact in the ASVspoof 2019/2021 challenge dataset: silence duration correlates with the spoofing label, so models may learn to rely on silence duration rather than on genuine speech features, which undermines the reliability of spoof detection.

Contribution

The study exposes a silence-duration artifact in the dataset and shows that models trained only on silence features can still reach high accuracy, which points to a potential bias in spoof detection models trained on this data.

Findings

01

Models trained solely on the duration of the leading silence achieve up to 85% accuracy and an equal error rate (EER) of 15.1%.

02

Silence trimming during preprocessing significantly worsens model performance.

03

Silence duration correlates with spoofing labels, affecting system score interpretation.
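The silence durations these findings refer to can be measured in several ways; this summary does not say which one the authors used. The following is a minimal sketch in Python, assuming a simple amplitude rule: a sample counts as silent when its magnitude lies more than 40 dB below the clip's peak. The function name `silence_durations` and the `threshold_db` parameter are hypothetical and chosen for illustration only.

```python
import numpy as np


def silence_durations(waveform, sample_rate, threshold_db=-40.0):
    """Return (leading, trailing) silence of a mono clip, in seconds.

    Assumption (not taken from the paper): a sample is "silent" when its
    magnitude is below `threshold_db` relative to the clip's peak magnitude.
    """
    samples = np.abs(np.asarray(waveform, dtype=np.float64))
    peak = samples.max() if samples.size else 0.0
    if peak == 0.0:
        # Empty or all-zero clip: treat the whole clip as leading silence.
        return float(samples.size / sample_rate), 0.0

    # Convert the relative dB threshold into a linear amplitude threshold.
    threshold = peak * 10.0 ** (threshold_db / 20.0)

    # Indices of all samples that are loud enough to count as non-silent.
    active = np.flatnonzero(samples >= threshold)

    leading = active[0] / sample_rate
    trailing = (samples.size - 1 - active[-1]) / sample_rate
    return float(leading), float(trailing)
```

Applied to every utterance in a split, the two returned values give the per-clip features whose distribution can then be compared between bona fide and spoofed examples.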

Abstract

We present our analysis of a significant data artifact in the official 2019/2021 ASVspoof Challenge Dataset. We identify an uneven distribution of silence duration in the training and test splits, which tends to correlate with the target prediction label. Bonafide instances tend to have significantly longer leading and trailing silences than spoofed instances. In this paper, we explore this phenomenon and its impact in depth. We compare several types of models trained on a) only the duration of the leading silence and b) only the duration of the leading and trailing silence. Results show that models trained on only the duration of the leading silence perform particularly well, and achieve up to 85% accuracy and an equal error rate (EER) of 15.1%. At the same time, we observe that trimming silence during pre-processing and then training established antispoofing models using…
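The equal error rate quoted in the abstract is the operating point at which the false acceptance rate (spoofs accepted as bona fide) equals the false rejection rate (bona fide clips rejected). The sketch below approximates it in Python by sweeping a threshold over the observed scores. It is an illustration under stated assumptions, not the authors' evaluation code: the function name `equal_error_rate` is hypothetical, and higher scores are assumed to mean "more likely bona fide". Under that convention, a leading-silence duration could serve directly as a score, since the abstract reports that bona fide instances tend to have longer silences.

```python
import numpy as np


def equal_error_rate(scores, labels):
    """Approximate the EER by a threshold sweep.

    scores: higher means "more likely bona fide" (assumed convention).
    labels: 1 for bona fide, 0 for spoof.
    """
    scores = np.asarray(scores, dtype=np.float64)
    labels = np.asarray(labels, dtype=bool)
    if labels.all() or not labels.any():
        raise ValueError("need at least one bona fide and one spoofed example")

    best_gap, eer = np.inf, 1.0
    # Candidate thresholds: every observed score, plus +inf (accept nothing).
    for threshold in np.append(np.unique(scores), np.inf):
        accepted = scores >= threshold
        far = np.mean(accepted[~labels])  # spoofs wrongly accepted
        frr = np.mean(~accepted[labels])  # bona fide wrongly rejected
        gap = abs(far - frr)
        if gap < best_gap:
            # Keep the threshold where the two error rates are closest.
            best_gap, eer = gap, (far + frr) / 2.0
    return float(eer)
```

On finite data the two error rates rarely cross exactly, so the function reports their average at the closest point; evaluation toolkits may interpolate instead and return slightly different values.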
