The SYSU System for the Interspeech 2015 Automatic Speaker Verification Spoofing and Countermeasures Challenge

Shitao Weng; Shushan Chen; Lei Yu; Xuewei Wu; Weicheng Cai; Zhi Liu; Ming Li

arXiv:1507.06711·cs.SD·July 30, 2015·5 cites

Open Access

TL;DR

This paper presents a spoofing countermeasure for speaker verification that fuses several i-vector subsystems built on different features and classifiers. It reports an equal error rate (EER) of 0.29% on the development set and 3.26% on the test set of the INTERSPEECH 2015 challenge.

Contribution

It introduces a novel fusion approach combining acoustic, phase, and phonetic features with multiple classifiers for improved spoofing detection.

Findings

01 Achieved 0.29% EER on development set

02 Achieved 3.26% EER on test set

03 Enhanced performance through feature and score fusion
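For readers unfamiliar with the metric, the equal error rate (EER) is the operating point at which the false acceptance rate (spoofed trials accepted as genuine) equals the false rejection rate (genuine trials rejected). The following is a minimal sketch of how it can be estimated from two lists of detection scores. The function name `compute_eer`, the convention that higher scores mean "more likely genuine", and the toy scores are assumptions for illustration, not the challenge's official scoring tool.

```python
def compute_eer(genuine_scores, spoof_scores):
    """Estimate the EER by sweeping a threshold over all observed scores.

    Assumes higher scores indicate genuine (human) speech.
    Returns the EER as a fraction in [0, 1].
    """
    thresholds = sorted(set(genuine_scores) | set(spoof_scores))
    best_gap = None
    eer = None
    for t in thresholds:
        # False acceptance: spoofed trials scored at or above the threshold.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        # False rejection: genuine trials scored below the threshold.
        frr = sum(s < t for s in genuine_scores) / len(genuine_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap = gap
            # Average the two rates at the closest crossing point.
            eer = (far + frr) / 2.0
    return eer


# Toy scores (hypothetical): one spoofed trial overlaps the genuine range.
genuine = [0.9, 0.8, 0.7, 0.6]
spoof = [0.1, 0.2, 0.3, 0.65]
print(f"EER = {compute_eer(genuine, spoof):.2%}")
```

With a finite set of trials the two error curves rarely cross exactly, so this sketch averages the two rates at the threshold where they are closest; interpolating between neighbouring thresholds is a common alternative.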

Abstract

Many existing speaker verification systems are reported to be vulnerable to various spoofing attacks, for example speaker-adapted speech synthesis, voice conversion, and playback. To detect these spoofed speech signals as a countermeasure, we propose a score-level fusion approach with several different i-vector subsystems. We show that the acoustic-level Mel-frequency cepstral coefficient (MFCC) features, the phase-level modified group delay cepstral coefficient (MGDCC) features, and the phonetic-level phoneme posterior probability (PPP) tandem features are effective for the countermeasure. Furthermore, feature-level fusion of these features before i-vector modeling also enhances the performance. A polynomial kernel support vector machine is adopted as the supervised classifier. In order to enhance the generalizability of the countermeasure, we also adopted the cosine…
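Two of the building blocks named in the abstract, the polynomial kernel and score-level fusion, can be illustrated with a small sketch. The abstract does not give the kernel hyperparameters or the fusion rule, so everything specific here is an assumption: the kernel form `(gamma * <x, y> + coef0) ** degree` is the textbook definition, and the fusion is a plain weighted average of z-normalized subsystem scores. The function names, default values, and toy scores are hypothetical.

```python
from statistics import mean, pstdev


def polynomial_kernel(x, y, degree=2, gamma=1.0, coef0=1.0):
    """Textbook polynomial kernel: (gamma * <x, y> + coef0) ** degree.

    The hyperparameter defaults are placeholders, not the paper's values.
    """
    dot = sum(a * b for a, b in zip(x, y))
    return (gamma * dot + coef0) ** degree


def z_normalize(scores):
    """Shift and scale one subsystem's scores to zero mean, unit variance."""
    mu = mean(scores)
    sigma = pstdev(scores)
    if sigma == 0:
        return [0.0 for _ in scores]
    return [(s - mu) / sigma for s in scores]


def fuse_scores(subsystem_scores, weights=None):
    """Weighted average of z-normalized scores, one fused score per trial.

    subsystem_scores maps a subsystem name to its per-trial scores.
    Equal weights are assumed when none are given.
    """
    names = list(subsystem_scores)
    if weights is None:
        weights = {name: 1.0 / len(names) for name in names}
    normalized = {name: z_normalize(subsystem_scores[name]) for name in names}
    n_trials = len(normalized[names[0]])
    return [
        sum(weights[name] * normalized[name][i] for name in names)
        for i in range(n_trials)
    ]


# Hypothetical per-trial scores from two subsystems on different scales.
scores = {"mfcc": [1.0, 3.0], "mgdcc": [10.0, 30.0]}
print(fuse_scores(scores))
print(polynomial_kernel([1.0, 2.0], [3.0, 4.0]))
```

Normalizing before averaging keeps a subsystem with a wide score range from dominating the fused score; in practice the fusion weights would typically be tuned on the development set.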

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing