Towards One-bit ASR: Extremely Low-bit Conformer Quantization Using Co-training and Stochastic Precision
Zhaoqing Li, Haoning Xu, Zengrui Jin, Lingwei Meng, Tianzi Wang, Huimeng Wang, Youjun Chen, Mingyu Cui, Shujie Hu, Xunying Liu

TL;DR
This paper introduces 2-bit and 1-bit weight quantization methods for Conformer ASR systems. Multiple-precision co-training, stochastic precision, and tensor-wise learnable scaling factors are used to compress the models without a statistically significant performance loss.
Contribution
It presents an extremely low-bit (2-bit and 1-bit) quantization approach for Conformer ASR systems that maintains performance by combining multiple-precision model co-training, stochastic precision, and tensor-wise learnable scaling factors.
Findings
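Achieves performance-lossless 2-bit and 1-bit quantization of Conformer ASR systems trained on the 300-hr Switchboard and 960-hr LibriSpeech corpora.
Attains maximum overall compression ratios of 16.2x and 16.6x without a statistically significant WER increase.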
Demonstrates practical viability of ultra-low-bit ASR models.
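To make the compression-ratio arithmetic concrete, the following is a minimal sketch, not the paper's accounting. It assumes a 32-bit full-precision baseline and a hypothetical `quant_fraction` parameter for the share of weights stored at low precision; the function name and parameters are illustrative only, and the sketch does not reproduce the 16.2x and 16.6x figures, which depend on details not given in this summary.

```python
def compression_ratio(num_params, quant_fraction, quant_bits, full_bits=32):
    """Ratio of full-precision storage to mixed-precision storage.

    num_params     -- total number of weights (hypothetical value)
    quant_fraction -- share of weights stored at quant_bits (assumption)
    quant_bits     -- bit-width of the quantized weights, e.g. 2 or 1
    full_bits      -- bit-width of the baseline, assumed to be 32
    """
    if not 0.0 <= quant_fraction <= 1.0:
        raise ValueError("quant_fraction must lie in [0, 1]")
    original_bits = num_params * full_bits
    # Average bits per weight after quantizing only part of the model.
    avg_bits = quant_fraction * quant_bits + (1.0 - quant_fraction) * full_bits
    return original_bits / (num_params * avg_bits)


if __name__ == "__main__":
    # Every weight at 2 bits versus a 32-bit baseline.
    print(compression_ratio(1000, 1.0, 2))
    # Every weight at 1 bit versus a 32-bit baseline.
    print(compression_ratio(1000, 1.0, 1))
```

Under these assumptions, quantizing every weight to 2 bits gives a ratio of 16 and quantizing every weight to 1 bit gives 32; any weights left at full precision pull the overall ratio down.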
Abstract
Model compression has become an emerging need as the sizes of modern speech systems rapidly increase. In this paper, we study model weight quantization, which directly reduces the memory footprint to accommodate computationally resource-constrained applications. We propose novel approaches to perform extremely low-bit (i.e., 2-bit and 1-bit) quantization of Conformer automatic speech recognition systems using multiple precision model co-training, stochastic precision, and tensor-wise learnable scaling factors to alleviate quantization-incurred performance loss. The proposed methods can achieve performance-lossless 2-bit and 1-bit quantization of Conformer ASR systems trained with the 300-hr Switchboard and 960-hr LibriSpeech corpora. Maximum overall performance-lossless compression ratios of 16.2 and 16.6 times are achieved without a statistically significant increase in the word error…
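As a rough illustration of low-bit weight quantization with a single scaling factor per tensor, here is a minimal sketch in plain Python. It is not the authors' implementation: the function names, the 2-bit level set {-2, -1, 0, 1}, and the mean-absolute-value scale are all assumptions made for the example. In the paper the scaling factors are learned during training, and co-training and stochastic precision are not shown here.

```python
def mean_abs_scale(weights):
    """Heuristic tensor-wise scale: mean absolute weight (assumption).

    The paper learns its scaling factors; this fixed value only stands in
    for a plausible starting point.
    """
    return sum(abs(w) for w in weights) / len(weights)


def quantize_1bit(weights, scale):
    """Map each weight to +scale or -scale according to its sign."""
    return [scale if w >= 0 else -scale for w in weights]


def quantize_2bit(weights, scale):
    """Snap each weight to scale * k with k in {-2, -1, 0, 1}.

    The four integer levels are an assumed signed 2-bit grid. Python's
    round() resolves exact .5 ties to the nearest even integer.
    """
    quantized = []
    for w in weights:
        k = round(w / scale)      # nearest integer level
        k = max(-2, min(1, k))    # clamp to the 2-bit range
        quantized.append(k * scale)
    return quantized


if __name__ == "__main__":
    tensor = [0.4, -0.2, 0.1, -0.7]  # hypothetical weight tensor
    s = mean_abs_scale(tensor)
    print(quantize_1bit(tensor, s))
    print(quantize_2bit(tensor, s))
```

Each quantized weight needs only its 1-bit or 2-bit level index, plus one shared scale per tensor, which is where the memory saving comes from.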
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Fault Detection and Control Systems · Blind Source Separation Techniques
