VCNAC: A Variable-Channel Neural Audio Codec for Mono, Stereo, and Surround Sound

Florian Grötschla; Arunasish Sen; Alessandro Lombardi; Guillermo Cámbara; Andreas Schwarz

arXiv:2601.14960 · cs.SD · January 22, 2026

Open Access

TL;DR

VCNAC introduces a unified neural audio codec capable of handling mono, stereo, and surround sound with a single model, ensuring high-quality reconstruction across various channel setups.

Contribution

The paper presents a variable-channel neural audio codec in which a single shared encoder and decoder parametrization supports multiple channel configurations, from mono to 5.1 surround.

Findings

01  Maintains high perceptual quality across mono, stereo, and surround sound.

02  A single shared set of codebooks lets generative language models be trained once and scaled at inference time across modalities and channel configurations.

03  Achieves competitive objective spatial audio metrics and positive subjective listening results.

Abstract

We present VCNAC, a variable-channel neural audio codec. Our approach features a single encoder and decoder parametrization that enables native inference for different channel setups, from mono speech to cinematic 5.1-channel surround audio. Channel compatibility objectives ensure that multi-channel content maintains perceptual quality when decoded to fewer channels. The shared representation enables training of generative language models on a single set of codebooks while supporting inference-time scalability across modalities and channel configurations. Evaluation using objective spatial audio metrics and subjective listening tests demonstrates that our unified approach maintains high reconstruction quality across mono, stereo, and surround audio configurations.
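To make "decoded to fewer channels" concrete, the sketch below folds a 5.1 frame down to stereo and then to mono, and measures how far a lower-channel decode is from that reference. It is an illustration only, not the paper's method: the channel order, the roughly -3 dB gains (a common convention for passive downmixing), the dropped LFE channel, and the helper names `downmix_51_to_stereo`, `downmix_stereo_to_mono`, and `mean_squared_error` are all assumptions, and the abstract does not specify what form the channel compatibility objectives actually take.

```python
import math

# Assumed channel order for one 5.1 frame: L, R, C, LFE, Ls, Rs (illustrative).
G = 1 / math.sqrt(2)  # about -3 dB, a common passive-downmix gain (assumption)


def downmix_51_to_stereo(frame):
    """Fold one 5.1 sample frame to (left, right); the LFE channel is dropped."""
    l, r, c, _lfe, ls, rs = frame
    return (l + G * c + G * ls, r + G * c + G * rs)


def downmix_stereo_to_mono(frame):
    """Average the left and right samples into a single mono sample."""
    left, right = frame
    return 0.5 * (left + right)


def mean_squared_error(decoded, reference):
    """Hypothetical stand-in for a compatibility measure: MSE over all
    samples of two equally shaped lists of frames."""
    diffs = [
        (x - y) ** 2
        for frame_a, frame_b in zip(decoded, reference)
        for x, y in zip(frame_a, frame_b)
    ]
    return sum(diffs) / len(diffs)


if __name__ == "__main__":
    surround = [(1.0, 0.0, 0.0, 0.5, 0.0, 0.0), (0.0, 0.0, 1.0, 0.0, 0.0, 0.0)]
    stereo_ref = [downmix_51_to_stereo(f) for f in surround]
    mono_ref = [downmix_stereo_to_mono(f) for f in stereo_ref]
    print(stereo_ref, mono_ref)
```

In a training setup along these lines, the stereo output decoded from a 5.1 input would be compared against `stereo_ref` rather than against the original six channels; a perceptual or spectral loss would likely replace the plain MSE used here.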

Taxonomy

Topics: Speech and Audio Processing · Music and Audio Processing · Hearing Loss and Rehabilitation