SACodec: Asymmetric Quantization with Semantic Anchoring for Low-Bitrate High-Fidelity Neural Speech Codecs

Zhongren Dong; Bin Wang; Jing Han; Haotian Guo; Xiaojun Mo; Yimin Cao; Zixing Zhang

arXiv:2512.20944·cs.SD·December 25, 2025

TL;DR

SACodec introduces an asymmetric dual-quantizer with semantic anchoring to improve low-bitrate neural speech codecs, achieving high fidelity and semantic richness at just 1.5 kbps.

Contribution

The paper proposes an asymmetric quantization framework with semantic anchoring that decouples semantic and acoustic encoding to improve speech quality at low bitrates.

Findings

1. Achieves state-of-the-art performance at 1.5 kbps.

2. Reconstructs audio with high perceptual quality.

3. Enhances semantic richness in downstream tasks.

Abstract

Neural speech codecs face a fundamental trade-off at low bitrates: preserving acoustic fidelity often compromises semantic richness. To address this, we introduce SACodec, a novel codec built upon an asymmetric dual-quantizer that employs our proposed Semantic Anchoring mechanism. This design strategically decouples the quantization of semantic and acoustic details. Semantic anchoring is achieved via a lightweight projector that aligns acoustic features with a frozen, large-scale mHuBERT codebook, injecting linguistic priors while guaranteeing full codebook utilization. Subsequently, for acoustic details, a residual activation module with SimVQ enables a single-layer quantizer (the acoustic path) to faithfully recover fine-grained information. At just 1.5 kbps, SACodec establishes a new state of the art by excelling in both fidelity and semantics: subjective listening tests confirm that…
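The two-path idea in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: the function names, the identity projector, and the tiny codebooks are hypothetical, the projector is assumed to act on the encoder feature, and the residual activation module and SimVQ are omitted entirely. The sketch only shows a frozen codebook lookup followed by a single-layer quantizer on the residual, plus a generic bitrate formula (frame rate times bits per frame) evaluated on made-up parameters.

```python
import math


def nearest(codebook, vec):
    """Index of the codebook entry closest to vec (squared Euclidean distance)."""
    dists = [sum((c - v) ** 2 for c, v in zip(entry, vec)) for entry in codebook]
    return dists.index(min(dists))


def project(weights, vec):
    """Linear projector: a plain matrix-vector product."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def dual_quantize(feature, projector, semantic_cb, acoustic_cb):
    """Two-stage quantization of a single frame.

    Assumption: the projector maps the encoder feature into the space of the
    frozen semantic codebook; the acoustic codebook quantizes what is left.
    """
    z = project(projector, feature)
    sem_idx = nearest(semantic_cb, z)  # semantic path (frozen codebook)
    residual = [a - b for a, b in zip(z, semantic_cb[sem_idx])]
    ac_idx = nearest(acoustic_cb, residual)  # acoustic path (single layer)
    recon = [s + a for s, a in zip(semantic_cb[sem_idx], acoustic_cb[ac_idx])]
    return sem_idx, ac_idx, recon


def bitrate_bps(frame_rate_hz, codebook_sizes):
    """Bits per second when each frame emits one index per codebook."""
    return frame_rate_hz * sum(math.log2(k) for k in codebook_sizes)


# Toy values chosen only for illustration; none come from the paper.
identity = [[1.0, 0.0], [0.0, 1.0]]
semantic_cb = [[0.0, 0.0], [1.0, 1.0]]
acoustic_cb = [[0.1, -0.1], [-0.1, 0.1]]
sem_idx, ac_idx, recon = dual_quantize([1.1, 0.9], identity, semantic_cb, acoustic_cb)
```

In this toy case the semantic lookup picks the coarse entry nearest the feature and the acoustic lookup picks the entry nearest the remaining offset, so the sum of the two entries approximates the input. Calling `bitrate_bps` with a hypothetical frame rate and codebook sizes shows how a two-index-per-frame design translates into a bitrate; the actual SACodec configuration is not stated in the text above.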

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Hearing Loss and Rehabilitation