Model Compression for DNN-based Speaker Verification Using Weight Quantization

Jingyu Li; Wei Liu; Zhaoyang Zhang; Jiong Wang; Tan Lee

arXiv:2210.17326·eess.AS·September 26, 2023

TL;DR

This paper shows that weight quantization effectively compresses DNN-based speaker verification models such as ECAPA-TDNN and ResNet, reducing model size several-fold without noticeable performance loss. ResNet is the more robust of the two at lower bit-widths, which may be related to its smooth weight distribution.

Contribution

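The study applies weight quantization to compress two widely used SV models, ECAPA-TDNN and ResNet, and analyzes their robustness, generalization, and retention of speaker-relevant knowledge. It finds ResNet more robust under lower-bitwidth quantization and suggests this may be related to its smooth weight distribution.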

Findings

01 Model size can be reduced several-fold without noticeable performance degradation

02 ResNet compresses more robustly than ECAPA-TDNN at lower bit-widths

03 Quantized models retain most speaker-relevant knowledge

Abstract

DNN-based speaker verification (SV) models demonstrate significant performance at relatively high computation costs. Model compression can be applied to reduce the model size for lower resource consumption. The present study exploits weight quantization to compress two widely-used SV models, namely ECAPA-TDNN and ResNet. Experimental results on VoxCeleb show that weight quantization is effective for compressing SV models. The model size can be reduced multiple times without noticeable degradation in performance. Compression of ResNet shows more robust results than ECAPA-TDNN with lower-bitwidth quantization. Analysis of the layer weights suggests that the smooth weight distribution of ResNet may be related to its better robustness. The generalization ability of the quantized model is validated via a language-mismatched SV task. Furthermore, analysis by information probing reveals that…
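The abstract above does not say which quantization scheme the authors use, so the following is only a minimal sketch of one common choice, min-max uniform quantization, written in plain Python. The function names (`quantize_uniform`, `dequantize`, `compression_ratio`) and the Gaussian toy weights are assumptions made for illustration and are not taken from the paper. It shows how float weights map to integer codes of a given bit-width, and how the nominal size reduction relative to 32-bit floats follows from the bit-width alone.

```python
import random


def quantize_uniform(weights, bits):
    """Map float weights to integer codes in [0, 2**bits - 1].

    Generic min-max uniform quantizer (an assumption, not the paper's method).
    """
    levels = 2 ** bits - 1
    w_min, w_max = min(weights), max(weights)
    # Guard against a constant weight vector, where the range would be zero.
    scale = (w_max - w_min) / levels if w_max > w_min else 1.0
    codes = [round((w - w_min) / scale) for w in weights]
    return codes, scale, w_min


def dequantize(codes, scale, w_min):
    """Reconstruct approximate float weights from integer codes."""
    return [c * scale + w_min for c in codes]


def compression_ratio(bits, full_precision_bits=32):
    """Nominal size reduction, ignoring the stored scale and offset."""
    return full_precision_bits / bits


if __name__ == "__main__":
    random.seed(0)
    # Hypothetical layer weights drawn from a smooth, bell-shaped distribution.
    weights = [random.gauss(0.0, 0.1) for _ in range(1000)]
    for bits in (8, 4, 2):
        codes, scale, w_min = quantize_uniform(weights, bits)
        restored = dequantize(codes, scale, w_min)
        max_err = max(abs(a - b) for a, b in zip(weights, restored))
        print(f"{bits}-bit: ratio {compression_ratio(bits):.0f}x, "
              f"max reconstruction error {max_err:.5f}")
```

With this scheme the worst-case reconstruction error per weight is half a quantization step, so it grows as the bit-width shrinks. Whether that error hurts verification accuracy depends on the model, which is the question the paper's experiments address.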

Taxonomy

Topics: Speech Recognition and Synthesis · Natural Language Processing Techniques · Speech and Audio Processing

Methods: Batch Normalization · 1x1 Convolution · Max Pooling · Average Pooling · Residual Connection · Bottleneck Residual Block · Residual Block · Convolution · Global Average Pooling