ClariCodec: Optimising Neural Speech Codes for 200bps Communication using Reinforcement Learning

Junyi Wang; Chi Zhang; Jing Qian; Haifeng Luo; Hao Wang; Zengrui Jin; Chao Zhang

arXiv:2604.14654·cs.SD·April 21, 2026

TL;DR

ClariCodec is a neural speech codec operating at 200 bps that uses reinforcement learning to improve intelligibility, reducing WER on LibriSpeech test-clean from 3.68% to 3.20% while maintaining perceptual quality.

Contribution

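It introduces an RL-based fine-tuning method for ultra-low-bitrate speech codecs: quantisation is reformulated as a stochastic policy, and the encoder is fine-tuned with WER-driven rewards while the acoustic reconstruction pipeline remains frozen.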

Findings

01

Achieves 3.68% WER on LibriSpeech test-clean at 200 bps without RL.

02

RL fine-tuning reduces WER to 3.20% on test-clean.

03

Maintains perceptual quality while RL fine-tuning improves intelligibility.

Abstract

In bandwidth-constrained communication such as satellite and underwater channels, speech must often be transmitted at ultra-low bitrates where intelligibility is the primary objective. At such extreme compression levels, codecs trained with acoustic reconstruction losses tend to allocate bits to perceptual detail, leading to substantial degradation in word error rate (WER). This paper proposes ClariCodec, a neural speech codec operating at 200 bits per second (bps) that reformulates quantisation as a stochastic policy, enabling reinforcement learning (RL)-based optimisation of intelligibility. Specifically, the encoder is fine-tuned using WER-driven rewards while the acoustic reconstruction pipeline remains frozen. Even without RL, ClariCodec achieves 3.68% WER on the LibriSpeech test-clean set at 200 bps, already competitive with codecs operating at higher bitrates. Further RL…
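The idea of treating quantisation as a stochastic policy with a WER-driven reward can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and is not taken from the paper: the softmax over negative squared distances as the policy, the plain REINFORCE estimator with a scalar baseline, the reward defined as negative WER, and all function names (`wer`, `quantiser_policy`, `sample_index`, `reinforce_logit_grad`). The abstract does not state the actual policy parametrisation, reward shaping, or RL algorithm.

```python
import math
import random


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance divided by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # prev[j] holds the edit distance between the ref prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(prev[j] + 1,               # deletion
                            curr[j - 1] + 1,           # insertion
                            prev[j - 1] + (r != h)))   # substitution or match
        prev = curr
    return prev[-1] / max(len(ref), 1)


def quantiser_policy(latent, codebook, temperature=1.0):
    """Assumed policy: softmax over negative squared distances to each code."""
    logits = [-sum((x - c) ** 2 for x, c in zip(latent, code)) / temperature
              for code in codebook]
    top = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - top) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_index(probs, rng):
    """Sample a code index from the policy instead of taking the nearest code."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def reinforce_logit_grad(probs, action, reward, baseline=0.0):
    """REINFORCE estimate: (reward - baseline) * d log pi(action) / d logits."""
    advantage = reward - baseline
    return [advantage * ((1.0 if k == action else 0.0) - p)
            for k, p in enumerate(probs)]


# Toy usage with a hypothetical 3-entry codebook and made-up transcripts.
rng = random.Random(0)
codebook = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]]
probs = quantiser_policy([0.9, 1.1], codebook)
action = sample_index(probs, rng)
reward = -wer("send help now", "send kelp now")  # lower WER gives higher reward
grad = reinforce_logit_grad(probs, action, reward, baseline=-0.5)
```

In a real system the gradient with respect to the logits would be backpropagated into the encoder only, which matches the abstract's statement that the acoustic reconstruction pipeline stays frozen; the transcripts would come from an ASR model run on the decoded audio.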
