Universal Spatial Audio Transcoder
Amaia Sagasti, Davide Scaini, Daniel Arteaga

TL;DR
This paper introduces USAT, a universal spatial audio transcoder that generates optimal conversions between spatial audio formats and optimal decoders to any loudspeaker setup, with the aim of preserving as much spatial information as possible.
Contribution
The paper presents USAT, a novel psychoacoustically optimized algorithm capable of transcoding any spatial audio format to any output configuration, with an open source implementation.
Findings
USAT outperforms common existing methods in preserving spatial information.
USAT effectively transcodes multiple audio formats to various loudspeaker layouts.
The approach is validated through examples demonstrating superior performance.
Abstract
This paper addresses the challenges associated with both the conversion between different spatial audio formats and the decoding of a spatial audio format to a specific loudspeaker layout. Existing approaches often rely on layout remapping tools, which may not guarantee optimal conversion from a psychoacoustic perspective. To overcome these challenges, we present the Universal Spatial Audio Transcoder (USAT) method and its corresponding open source implementation. USAT generates an optimal decoder or transcoder for any input spatial audio format, adapting it to any output format or 2D/3D loudspeaker configuration. Drawing upon optimization techniques based on psychoacoustic principles, the algorithm maximizes the preservation of spatial information. We present examples of the decoding and transcoding of several audio formats, and show that the USAT approach is advantageous compared to the…
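To make the idea of decoding a spatial format to a loudspeaker layout more concrete, the following is a minimal sketch and not USAT's algorithm. It assumes a 2D first-order Ambisonics-style input with the simple encoding [W, X, Y] = [1, cos θ, sin θ], a plain sampling decoder for a regular square layout, and Gerzon's velocity and energy vectors as localization predictors. These vectors are widely used in the Ambisonics literature, but the abstract above does not state which psychoacoustic quantities USAT optimizes, so their use here is an assumption. All function names are hypothetical.

```python
import math

def encode_first_order_2d(theta):
    """Encode a plane wave from azimuth theta (radians) as [W, X, Y].
    Assumed convention for this sketch: W = 1, X = cos(theta), Y = sin(theta)."""
    return [1.0, math.cos(theta), math.sin(theta)]

def sampling_decoder_2d(speaker_azimuths):
    """Build a basic sampling decoder matrix (one row per loudspeaker).
    This is a textbook decoder for regular layouts, not an optimized one."""
    n = len(speaker_azimuths)
    return [[1.0 / n, 2.0 * math.cos(phi) / n, 2.0 * math.sin(phi) / n]
            for phi in speaker_azimuths]

def decode(matrix, signal):
    """Apply the decoder matrix to the encoded signal to get speaker gains."""
    return [sum(d * s for d, s in zip(row, signal)) for row in matrix]

def gerzon_vector(gains, speaker_azimuths, power):
    """Return (magnitude, direction) of a Gerzon vector.
    power=1 gives the velocity vector, power=2 the energy vector."""
    weights = [g ** power for g in gains]
    total = sum(weights)
    x = sum(w * math.cos(phi) for w, phi in zip(weights, speaker_azimuths)) / total
    y = sum(w * math.sin(phi) for w, phi in zip(weights, speaker_azimuths)) / total
    return math.hypot(x, y), math.atan2(y, x)

# Square layout and a source at 30 degrees (illustrative values).
speakers = [math.radians(a) for a in (45, 135, 225, 315)]
source = math.radians(30)

gains = decode(sampling_decoder_2d(speakers), encode_first_order_2d(source))
rv_mag, rv_dir = gerzon_vector(gains, speakers, power=1)
re_mag, re_dir = gerzon_vector(gains, speakers, power=2)

print("speaker gains:", [round(g, 4) for g in gains])
print("velocity vector: magnitude", round(rv_mag, 4), "direction (deg)", round(math.degrees(rv_dir), 2))
print("energy vector:   magnitude", round(re_mag, 4), "direction (deg)", round(math.degrees(re_dir), 2))
```

For this regular layout both vectors point at the source direction, with a velocity-vector magnitude of 1 and an energy-vector magnitude of 2/3. On irregular layouts a fixed decoder of this kind generally loses those properties, which is the sort of degradation an optimization-based decoder is meant to reduce.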
Taxonomy
Topics: Speech and Audio Processing · Advanced Data Compression Techniques · Music Technology and Sound Studies
