The Equalizer: Introducing Shape-Gain Decomposition in Neural Audio Codecs

Samir Sadok; Laurent Girin; Xavier Alameda-Pineda

arXiv:2602.15491 · cs.SD · February 18, 2026

Open Access

TL;DR

This paper introduces shape-gain decomposition into neural audio codecs, improving bitrate-distortion performance and reducing complexity by separately encoding gain and shape, inspired by classical speech coding techniques.

Contribution

The paper proposes a novel shape-gain decomposition method for neural audio codecs, enhancing robustness and efficiency by separating gain and shape processing.

Findings

01. Significant bitrate-distortion improvements achieved

02. Massive reduction in computational complexity

03. Enhanced robustness to input signal level variations

Abstract

Neural audio codecs (NACs) typically encode the short-term energy (gain) and normalized structure (shape) of speech/audio signals jointly within the same latent space. As a result, they have poor robustness to global variations of the input signal level: such a variation strongly affects the embedding vectors at the encoder output and their quantization. This joint encoding is inherently inefficient, leading to codebook redundancy and suboptimal bitrate-distortion performance. To address these limitations, we propose to introduce shape-gain decomposition, widely used in classical speech/audio coding, into the NAC framework. The principle of the proposed Equalizer methodology is to decompose the input signal (before the NAC encoder) into a gain and a normalized shape vector on a short-term basis. The shape vector is processed by the NAC, while the gain is…
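The decomposition step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The frame length of 320 samples, the use of the L2 norm as the gain, the `eps` floor, and the function names `shape_gain_decompose` and `shape_gain_recompose` are all assumptions made for illustration. How the paper encodes the gain is not covered here, since the excerpt is truncated at that point.

```python
import numpy as np


def shape_gain_decompose(signal, frame_len=320, eps=1e-8):
    """Split a 1-D signal into per-frame gains and unit-norm shape vectors.

    Assumptions (not from the paper): non-overlapping frames, gain defined
    as the frame's L2 norm, trailing samples that do not fill a frame dropped.
    """
    x = np.asarray(signal, dtype=np.float64)
    n_frames = len(x) // frame_len
    frames = x[: n_frames * frame_len].reshape(n_frames, frame_len)
    gains = np.linalg.norm(frames, axis=1)             # one scalar per frame
    shapes = frames / np.maximum(gains, eps)[:, None]  # eps avoids division by zero
    return gains, shapes


def shape_gain_recompose(gains, shapes):
    """Invert the decomposition: rescale each shape by its gain and flatten."""
    return (gains[:, None] * shapes).reshape(-1)


# Illustration: a global level change moves only the gains, not the shapes.
rng = np.random.default_rng(0)
x = rng.standard_normal(1600)

g_ref, s_ref = shape_gain_decompose(x)
g_quiet, s_quiet = shape_gain_decompose(0.1 * x)

shapes_match = np.allclose(s_ref, s_quiet)       # shape is level-invariant
gains_scale = np.allclose(g_quiet, 0.1 * g_ref)  # gain carries the level
reconstruction_ok = np.allclose(shape_gain_recompose(g_ref, s_ref), x)
```

Under these assumptions, scaling the input by a constant leaves the shape vectors unchanged and scales only the gains. This is the level-robustness property the abstract attributes to feeding the codec the shape rather than the raw signal.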

Taxonomy

Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Advanced Data Compression Techniques