A Continuous Liveness Detection System for Text-independent Speaker Verification
Linghan Zhang, Jie Yang

TL;DR
VoiceLive is a practical liveness detection system for voice authentication on smartphones. It requires no additional hardware: it uses stereo recordings and device-user interaction to detect spoofing, and it achieves over 99% accuracy.
Contribution
The paper introduces VoiceLive, a smartphone-based liveness detection system that uses time-difference-of-arrival (TDoA) dynamics in stereo recordings, together with device positioning, and requires no extra hardware.
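To make the TDoA idea concrete, here is a minimal sketch of estimating the time difference of arrival between the two channels of a stereo recording. It uses a plain cross-correlation search, which is a generic technique and not necessarily the estimator used in the paper. The function name `estimate_tdoa` and its parameters are assumptions made for illustration.

```python
import math

def estimate_tdoa(left, right, sample_rate, max_lag):
    """Return the delay (in seconds) of `right` relative to `left`.

    Hypothetical helper: a brute-force cross-correlation search over
    integer sample lags in [-max_lag, max_lag]. A positive result means
    the sound reached the right channel later than the left channel.
    """
    best_lag, best_score = 0, -math.inf
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(len(left)):
            j = i + lag
            if 0 <= j < len(right):
                score += left[i] * right[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag / sample_rate

# Toy stereo pair: the same short pulse, arriving 3 samples later on the right.
left = [0.0] * 20
right = [0.0] * 20
left[5], left[6] = 1.0, 0.5
right[8], right[9] = 1.0, 0.5

tdoa = estimate_tdoa(left, right, sample_rate=48000, max_lag=10)
print(f"Estimated TDoA: {tdoa * 1e6:.1f} microseconds")
```

A real system would work on short frames of recorded audio and would likely need sub-sample resolution, since the two microphones on a phone are only centimeters apart.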
Findings
Achieves over 99% detection accuracy with an equal error rate (EER) of 1-2%.
Robust across different phone positions and angles.
Effective for both text-dependent and text-independent voice verification.
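The EER in the findings above is the operating point at which the false acceptance rate (replayed audio accepted as live) equals the false rejection rate (live users rejected). The sketch below shows one simple way to approximate it from two lists of detector scores. The function name `equal_error_rate` and the toy scores are assumptions for illustration, not data or code from the paper.

```python
def equal_error_rate(live_scores, replay_scores):
    """Approximate the EER by sweeping a decision threshold.

    Hypothetical helper: a sample is accepted as live when its score is
    greater than or equal to the threshold. The EER is taken at the
    threshold where the two error rates are closest to each other.
    """
    thresholds = sorted(set(live_scores) | set(replay_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        # False rejection: a live sample scored below the threshold.
        frr = sum(s < t for s in live_scores) / len(live_scores)
        # False acceptance: a replayed sample scored at or above it.
        far = sum(s >= t for s in replay_scores) / len(replay_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer

# Made-up scores: higher means "more likely a live speaker".
live = [0.9, 0.8, 0.7, 0.4]
replay = [0.1, 0.2, 0.3, 0.5]
eer = equal_error_rate(live, replay)
print(f"EER: {eer:.2%}")
```

With only a handful of scores the sweep is coarse. Evaluations on larger score sets usually interpolate between thresholds to locate the crossing point more precisely.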
Abstract
Voice authentication is drawing increasing attention and has become an attractive alternative to passwords for mobile authentication. Recent advances in mobile technology further accelerate the adoption of voice biometrics in an array of diverse mobile applications. However, recent studies show that voice authentication is vulnerable to replay attacks, where an adversary can spoof a voice authentication system using a pre-recorded voice sample collected from the victim. In this paper, we propose VoiceLive, a liveness detection system for both text-dependent and text-independent voice authentication on smartphones. VoiceLive detects a live user by leveraging the user's unique vocal system and the stereo recording of smartphones. In particular, utilizing the built-in gyroscope, loudspeaker, and microphone, VoiceLive first measures the smartphone's distance and angle from the user, then it…
Taxonomy
Topics: Speech and Audio Processing · Music and Audio Processing · Speech Recognition and Synthesis
