Parametric Object Coding in IVAS: Efficient Coding of Multiple Audio Objects at Low Bit Rates
Andrea Eichenseer, Srikanth Korse, Guillaume Fuchs, and Markus Multrus

TL;DR
This paper presents the parametric object coding mode of the IVAS codec, which transmits three or four audio objects at 24.4 or 32 kbit/s as a stereo downmix plus compact side information while preserving the spatial image of the original scene.
Contribution
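It describes a parametric coding approach in IVAS that derives side information (directions, the indices of the two dominant objects, and their power ratio) from the object metadata and the input audio, reducing bit rate and complexity while preserving spatial audio quality.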
Findings
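Faithfully reconstructs the spatial image of three or four arbitrarily placed objects at 24.4 or 32 kbit/s
Provides an immersive experience comparable to independent object coding in subjective listening tests
Requires a lower bit rate and less complexity than coding each audio object independently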
Abstract
The recently standardized 3GPP codec for Immersive Voice and Audio Services (IVAS) includes a parametric mode for efficiently coding multiple audio objects at low bit rates. In this mode, parametric side information is obtained from both the object metadata and the input audio objects. The side information comprises directional information, indices of two dominant objects, and the power ratio between these two dominant objects. It is transmitted to the decoder along with a stereo downmix. In IVAS, parametric object coding allows for transmitting three or four arbitrarily placed objects at bit rates of 24.4 or 32 kbit/s and faithfully reconstructing the spatial image of the original audio scene. Subjective listening tests confirm that IVAS provides a comparable immersive experience at lower bit rate and complexity compared to coding the audio objects independently using Enhanced Voice…
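To make the side information described in the abstract more concrete, the following is a minimal sketch of how the two dominant objects and their power ratio could be picked for one analysis tile. It is not the IVAS algorithm: the function names `frame_power` and `dominant_pair`, the use of plain per-object signal power, and the ratio definition p1 / (p1 + p2) are all assumptions made for illustration, and quantization and directional metadata are left out entirely.

```python
from typing import List, Tuple


def frame_power(samples: List[float]) -> float:
    """Sum of squared samples for one object in one analysis tile (hypothetical helper)."""
    return sum(s * s for s in samples)


def dominant_pair(powers: List[float]) -> Tuple[int, int, float]:
    """Return (index of strongest object, index of second strongest, power ratio).

    The ratio is defined here as p1 / (p1 + p2). This definition is an
    illustrative assumption, not taken from the IVAS specification.
    """
    if len(powers) < 2:
        raise ValueError("need at least two objects")
    # Sort object indices by descending power; sorted() is stable,
    # so ties keep the lower index first.
    order = sorted(range(len(powers)), key=lambda i: -powers[i])
    first, second = order[0], order[1]
    total = powers[first] + powers[second]
    # Silent tile: split evenly to avoid dividing by zero.
    ratio = 0.5 if total == 0.0 else powers[first] / total
    return first, second, ratio
```

For example, `dominant_pair([0.1, 4.0, 1.0, 0.5])` selects objects 1 and 2 with a ratio of 0.8. In a scheme of this kind, only the two indices and the ratio (plus directions) would need to be sent alongside the downmix, which is where the bit-rate saving over coding every object separately would come from.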
Taxonomy
Topics: Advanced Data Compression Techniques · Speech and Audio Processing · Hearing Loss and Rehabilitation
