Spoken Pass-Phrase Verification in the i-vector Space
Hossein Zeinali, Lukas Burget, Hossein Sameti, Jan Cernocky

TL;DR
This paper demonstrates that simple cosine scoring of i-vectors, extracted with phrase-specific HMMs or DNN features, effectively verifies spoken pass-phrases and outperforms previous methods on standard datasets.
Contribution
The paper applies the authors' i-vector extraction techniques to stand-alone, speaker-independent pass-phrase verification, showing that simple cosine scoring outperforms prior approaches.
Findings
Cosine scoring of i-vectors achieves high verification accuracy.
i-vectors extracted with phrase-specific HMMs or DNN bottle-neck features help to reject utterances with wrong pass-phrases.
Results surpass previously published results on the RSR2015 and RedDots datasets.
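The cosine scoring mentioned in the findings can be illustrated with a minimal sketch. It assumes the i-vectors have already been extracted; the function names, the averaging of length-normalized enrollment i-vectors into a single phrase model, and the threshold value are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def cosine_score(enroll_ivectors, test_ivector):
    """Cosine similarity between an averaged enrollment model and a test i-vector.

    enroll_ivectors: array-like of shape (n_enroll, dim), assumed precomputed.
    test_ivector: array-like of shape (dim,), assumed precomputed.
    """
    enroll = np.asarray(enroll_ivectors, dtype=float)
    test = np.asarray(test_ivector, dtype=float)
    # Assumption: length-normalize each enrollment i-vector, then average
    # them into one phrase model.
    enroll = enroll / np.linalg.norm(enroll, axis=1, keepdims=True)
    model = enroll.mean(axis=0)
    return float(model @ test / (np.linalg.norm(model) * np.linalg.norm(test)))

def verify_phrase(enroll_ivectors, test_ivector, threshold=0.5):
    """Accept the test utterance if its score reaches the threshold.

    The threshold is a hypothetical value; in practice it would be tuned
    on held-out data.
    """
    return cosine_score(enroll_ivectors, test_ivector) >= threshold
```

A test i-vector pointing in the same direction as the enrollment model scores close to 1 and is accepted, while an orthogonal one scores close to 0 and is rejected under the assumed threshold.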
Abstract
The task of spoken pass-phrase verification is to decide whether a test utterance contains the same phrase as given enrollment utterances. Besides other applications, pass-phrase verification can complement an independent speaker verification subsystem in text-dependent speaker verification. It can also be used for liveness detection by verifying that the user is able to correctly respond to a randomly prompted phrase. In this paper, we build on our previous work on i-vector based text-dependent speaker verification, where we have shown that i-vectors extracted using phrase-specific Hidden Markov Models (HMMs) or using Deep Neural Network (DNN) based bottle-neck (BN) features help to reject utterances with wrong pass-phrases. We apply the same i-vector extraction techniques to the stand-alone task of speaker-independent spoken pass-phrase classification and verification. The experiments…
Taxonomy
Topics: Speech Recognition and Synthesis · Natural Language Processing Techniques · Speech and Audio Processing
