Optimization of DNN-based speaker verification model through efficient quantization technique

Yeona Hong; Woo-Jin Chung; Hong-Goo Kang

arXiv:2407.08991·eess.AS·July 15, 2024·1 cites

TL;DR

This paper presents a quantization framework for DNN-based speaker verification models that halves model size while limiting the EER increase to 0.07%, enabling more efficient deployment on mobile devices.

Contribution

The paper introduces what the authors describe as the first quantization algorithm that maintains the accuracy of the ECAPA-TDNN speaker verification model while halving its size.

Findings

01 Model size reduced by 50%

02 Performance degradation limited to 0.07% EER increase

03 Effective layer-wise performance analysis for quantization

Abstract

As Deep Neural Networks (DNNs) rapidly advance in various fields, including speaker verification, they typically involve high computational costs and substantial memory consumption, which can be challenging to manage on mobile systems. Quantization of deep models offers a means to reduce both computational and memory expenses. Our research proposes an optimization framework for the quantization of the speaker verification model. By analyzing performance changes and model size reductions in each layer of a pre-trained speaker verification model, we have effectively minimized performance degradation while significantly reducing the model size. Our quantization algorithm is the first attempt to maintain the performance of the state-of-the-art pre-trained speaker verification model, ECAPA-TDNN, while significantly compressing its model size. Overall, our quantization approach resulted in…
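The layer-wise analysis mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm, which this excerpt does not specify. It assumes a generic symmetric uniform quantizer, two toy layers with hypothetical names (`layer_a`, `layer_b`), and weight mean squared error as a stand-in for the EER change that the paper actually measures. All function names here are illustrative.

```python
import random

def quantize_uniform(weights, num_bits):
    """Symmetric uniform quantization: snap each weight to a signed
    integer grid with 2**(num_bits-1) - 1 positive levels, then map back."""
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        return list(weights)
    levels = 2 ** (num_bits - 1) - 1  # e.g. 127 for 8 bits
    scale = max_abs / levels
    return [round(w / scale) * scale for w in weights]

def mean_squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def layerwise_report(layers, num_bits, full_bits=32):
    """Quantize one layer at a time and record, per layer:
    - mse: weight error (an assumed proxy; the paper reports EER instead)
    - size_reduction: fraction of the whole model's bits saved."""
    total_bits = sum(len(w) for w in layers.values()) * full_bits
    report = {}
    for name, weights in layers.items():
        quantized = quantize_uniform(weights, num_bits)
        saved_bits = len(weights) * (full_bits - num_bits)
        report[name] = {
            "mse": mean_squared_error(weights, quantized),
            "size_reduction": saved_bits / total_bits,
        }
    return report

# Toy model: two layers with different sizes and weight ranges (assumed data).
rng = random.Random(0)
layers = {
    "layer_a": [rng.gauss(0.0, 0.1) for _ in range(1000)],
    "layer_b": [rng.gauss(0.0, 1.0) for _ in range(3000)],
}
report = layerwise_report(layers, num_bits=8)
for name, stats in report.items():
    print(name, stats)
```

Ranking layers by size saved against error introduced is one way to decide which layers to quantize first. A real analysis would replace the MSE proxy with the verification error measured on an evaluation set.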

Taxonomy

Topics: Speech Recognition and Synthesis