Binary Neural Network for Speaker Verification
Tinglong Zhu, Xiaoyi Qin, Ming Li

TL;DR
This paper introduces a binary neural network approach for speaker verification that maintains high accuracy while significantly reducing memory and computational costs, suitable for low-resource devices.
Contribution
The paper presents a novel application of binarized neural networks to speaker verification, achieving comparable or better performance with much lower resource requirements.
Findings
A ResNet34-based binary network achieves an EER of around 5% on the VoxCeleb1 test set.
The binarized network outperforms the real-valued network on a text-dependent dataset.
Binarization yields 32x memory savings while largely maintaining accuracy.
Abstract
Although deep neural networks are successful for many tasks in the speech domain, their high computational and memory costs make it difficult to directly deploy high-performance neural network systems on low-resource embedded devices. There are several mechanisms to reduce the size of neural networks, e.g. parameter pruning and parameter quantization. This paper focuses on how to apply binary neural networks to the task of speaker verification. The proposed binarization of training parameters can largely maintain the performance while significantly reducing storage space requirements and computational costs. Experimental results show that, after binarizing the Convolutional Neural Network, the ResNet34-based network achieves an EER of around 5% on the VoxCeleb1 testing dataset and even outperforms the traditional real-valued network on the text-dependent dataset:…
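To make the storage claim concrete, here is a minimal sketch of weight binarization and bit-packing. It is not the paper's method: the summary does not describe the exact binarization scheme, so the sign function used here is an assumption (it is a common choice for binary networks), and the helper names `binarize`, `pack_bits`, and `unpack_bits` are hypothetical. The example only illustrates why storing one bit per weight instead of a 32-bit float gives a 32x reduction.

```python
import random
import struct


def binarize(weights):
    """Map each real-valued weight to +1 or -1 by its sign (zero maps to +1).

    Assumption: plain sign binarization; the paper's scheme may differ.
    """
    return [1 if w >= 0 else -1 for w in weights]


def pack_bits(signs):
    """Pack +1/-1 values into bytes, one bit per weight (+1 -> 1, -1 -> 0)."""
    packed = bytearray((len(signs) + 7) // 8)
    for i, s in enumerate(signs):
        if s == 1:
            packed[i // 8] |= 1 << (i % 8)
    return bytes(packed)


def unpack_bits(packed, n):
    """Recover the first n +1/-1 values from the packed bytes."""
    return [1 if (packed[i // 8] >> (i % 8)) & 1 else -1 for i in range(n)]


# Hypothetical layer with 1024 real-valued weights.
random.seed(0)
weights = [random.gauss(0.0, 1.0) for _ in range(1024)]

# Size when stored as 32-bit floats (4 bytes per weight).
float_bytes = len(struct.pack(f"{len(weights)}f", *weights))

# Size when stored as packed sign bits (1 bit per weight).
signs = binarize(weights)
packed = pack_bits(signs)

ratio = float_bytes / len(packed)
print(f"float32: {float_bytes} bytes, binary: {len(packed)} bytes, ratio: {ratio:.0f}x")
```

The ratio follows directly from the bit widths (32 bits versus 1 bit per weight). A real model usually keeps some tensors, such as scaling factors or normalization parameters, in higher precision, so the end-to-end saving can be smaller than this per-layer figure.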
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing
