Entropy-Guided GRVQ for Ultra-Low Bitrate Neural Speech Codec
Yanzhou Ren, Noboru Harada, Daiki Takeuchi, Siyu Chen, Wei Liu, Xiao Zhang, Liyuan Zhang, Takehiro Moriya, and Shoji Makino

TL;DR
This paper introduces entropy-guided group residual vector quantization (EG-GRVQ) for neural speech codecs. It balances information content across quantizer groups to improve speech reconstruction quality and codebook efficiency at ultra-low bitrates.
Contribution
It proposes an entropy-guided grouping strategy for the acoustic branch of a neural speech codec: encoder output channels are partitioned so that each group carries an equal share of the total information, which improves performance at ultra-low bitrates.
Findings
Improved perceptual quality at ultra-low bitrates.
Enhanced intelligibility of reconstructed speech.
More efficient codebook utilization.
Abstract
Neural audio codecs (NACs) are essential for reconstructing high-quality speech signals and generating discrete representations for downstream speech language models. However, ensuring accurate semantic modeling while maintaining high-fidelity reconstruction under ultra-low bitrate constraints remains challenging. We propose entropy-guided group residual vector quantization (EG-GRVQ) for an ultra-low bitrate neural speech codec, which retains a semantic branch for linguistic information and incorporates an entropy-guided grouping strategy in the acoustic branch. Assuming that channel activations follow approximately Gaussian statistics, the variance of each channel can serve as a principled proxy for its information content. Based on this assumption, we partition the encoder output such that each group carries an equal share of the total information. This balanced allocation improves…
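The grouping idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how information is scored or how channels are assigned, so the score 0.5 * log(1 + variance), the greedy assignment rule, and the names `channel_information` and `balanced_groups` are all assumptions made for illustration. The sketch scores each channel from its variance, then places channels one at a time into whichever group currently holds the least information.

```python
import numpy as np


def channel_information(latents):
    """Per-channel information proxy from variance.

    latents: array of shape (channels, frames).
    Assumption: score = 0.5 * log(1 + variance), a nonnegative,
    monotone function of variance inspired by the Gaussian entropy
    formula. The paper's exact proxy may differ.
    """
    variances = latents.var(axis=1)
    return 0.5 * np.log1p(variances)


def balanced_groups(scores, num_groups):
    """Greedily assign channels to groups with near-equal total score.

    Assumption: a simple largest-first heuristic, not necessarily
    the partitioning rule used in the paper.
    """
    groups = [[] for _ in range(num_groups)]
    totals = np.zeros(num_groups)
    # Visit channels from highest to lowest score.
    for ch in np.argsort(scores)[::-1]:
        g = int(np.argmin(totals))  # group with the least information so far
        groups[g].append(int(ch))
        totals[g] += scores[ch]
    return groups, totals


# Toy stand-in for an encoder output: 16 channels, 200 frames,
# with a different scale (hence variance) per channel.
rng = np.random.default_rng(0)
scales = np.linspace(0.2, 3.0, 16)[:, None]
latents = rng.standard_normal((16, 200)) * scales

scores = channel_information(latents)
groups, totals = balanced_groups(scores, num_groups=4)

for g, (members, total) in enumerate(zip(groups, totals)):
    print(f"group {g}: channels {members}, total score {total:.3f}")
```

With this heuristic, the gap between the most and least loaded groups never exceeds the largest single-channel score, so high-variance channels are spread across groups rather than concentrated in one. In a group residual VQ setting, each resulting group would then be quantized by its own residual quantizer stack.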
Taxonomy
Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Advanced Data Compression Techniques
