SuperVoice: Text-Independent Speaker Verification Using Ultrasound Energy in Human Speech

Hanqing Guo; Qiben Yan; Nikolay Ivanov; Ying Zhu; Li Xiao; Eric J. Hunter

arXiv:2205.14496·cs.SD·May 31, 2022·1 cites

Open Access

TL;DR

SuperVoice leverages ultrasound frequency components of human speech to significantly improve speaker verification accuracy and security against spoofing attacks, outperforming existing systems with rapid processing times.

Contribution

This paper introduces SuperVoice, a speaker verification system that uses ultrasound speech features and a two-stream DNN architecture to improve both security and speed.

Findings

01. Achieves 0.58% equal error rate in speaker verification

02. Detects replay attacks with 0% error within 91 ms

03. Outperforms existing verification systems in accuracy and speed
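
The first finding is stated as an equal error rate (EER): the operating point where the false acceptance rate (impostors accepted) equals the false rejection rate (genuine speakers rejected). The sketch below shows one common way to estimate it from two lists of verification scores. It is an illustration of the metric only, not the paper's evaluation code; the function name `equal_error_rate` and the convention that a higher score means "more likely the claimed speaker" are assumptions made here.

```python
def equal_error_rate(genuine, impostor):
    """Estimate the EER from verification scores (hypothetical helper).

    genuine:  scores for trials where the speaker is who they claim to be
    impostor: scores for trials where the speaker is someone else
    A trial is accepted when its score is >= the threshold (assumption).
    Returns (eer, threshold) at the candidate threshold where the false
    acceptance and false rejection rates are closest.
    """
    if not genuine or not impostor:
        raise ValueError("both score lists must be non-empty")

    best = None  # (gap, eer, threshold)
    # Every observed score is a candidate decision threshold.
    for t in sorted(set(genuine) | set(impostor)):
        far = sum(s >= t for s in impostor) / len(impostor)  # impostors accepted
        frr = sum(s < t for s in genuine) / len(genuine)     # genuine rejected
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            # With finite data the two rates rarely match exactly,
            # so report their average at the closest crossing.
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]
```

With finitely many trials the two rates usually cross between candidate thresholds, which is why the sketch averages them at the nearest point. Evaluation toolkits often interpolate along the ROC curve instead.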

Abstract

Voice-activated systems are integrated into a variety of desktop, mobile, and Internet-of-Things (IoT) devices. However, voice spoofing attacks, such as impersonation and replay attacks, in which malicious attackers synthesize the voice of a victim or simply replay it, have brought growing security concerns. Existing speaker verification techniques distinguish individual speakers via the spectrographic features extracted from an audible frequency range of voice commands. However, they often have high error rates and/or long delays. In this paper, we explore a new direction of human voice research by scrutinizing the unique characteristics of human speech at the ultrasound frequency band. Our research indicates that the high-frequency ultrasound components (e.g. speech fricatives) from 20 to 48 kHz can significantly enhance the security and accuracy of speaker verification. We propose a…
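
To make the 20 to 48 kHz band mentioned in the abstract concrete, the sketch below measures what fraction of a signal's spectral energy falls in that range. It is a minimal illustration using a direct discrete Fourier transform in pure Python, not the feature extraction used by SuperVoice. The function name `band_energy_fraction`, its default band edges, and the requirement that the recording be sampled at 96 kHz or higher (so that 48 kHz is at or below the Nyquist frequency) are assumptions of this example.

```python
import cmath
import math


def band_energy_fraction(samples, sample_rate, f_lo=20_000.0, f_hi=48_000.0):
    """Fraction of non-DC spectral energy between f_lo and f_hi (hypothetical helper).

    Uses a direct O(n^2) DFT, so it is only suitable for short frames.
    """
    n = len(samples)
    if n == 0:
        raise ValueError("samples must be non-empty")
    if f_hi > sample_rate / 2:
        raise ValueError("band exceeds the Nyquist frequency")

    total = 0.0
    in_band = 0.0
    # Bins 1 .. n//2 cover the positive frequencies; bin 0 (DC) is skipped.
    for k in range(1, n // 2 + 1):
        freq = k * sample_rate / n
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(samples)
        )
        power = abs(coeff) ** 2
        total += power
        if f_lo <= freq <= f_hi:
            in_band += power
    return in_band / total if total > 0 else 0.0
```

As a sanity check, a 384-sample frame at 192 kHz containing a 1 kHz tone of amplitude 1 plus a 30 kHz tone of amplitude 0.5 gives a fraction of about 0.2, since the energies are in the ratio 1 to 0.25. A practical pipeline would use an FFT library and a windowed short-time transform rather than this direct sum.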

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing