Ambisonics Super-Resolution Using A Waveform-Domain Neural Network

Ismael Nawfal; Symeon Delikaris Manias; Mehrez Souden; Juha Merimaa; Joshua Atkins; Elisabeth McMullin; Shadi Pirhosseinloo; Daniel Phillips

arXiv:2508.00240·eess.AS·August 4, 2025

TL;DR

This paper introduces a neural-network-based method that upscales four-channel first-order Ambisonics (FOA) to higher-order Ambisonics (HOA), improving spatial accuracy and perceived audio quality while keeping the compact FOA format as the input.

Contribution

A data-driven approach, built on the fully convolutional time-domain network Conv-TasNet, that converts FOA to higher-order Ambisonics and surpasses conventional physics- and psychoacoustics-based renderers in quality.

Findings

0.6 dB average positional mean squared error difference between predicted and actual 3rd-order HOA

80% improvement in perceived audio quality

Outperforms conventional renderers in quantitative and qualitative assessments

Abstract

Ambisonics is a spatial audio format describing a sound field. First-order Ambisonics (FOA) is a popular format comprising only four channels. This limited channel count comes at the expense of spatial accuracy. Ideally, one would be able to keep the efficiency of the FOA format without its limitations. We have devised a data-driven spatial audio solution that retains the efficiency of the FOA format but achieves quality that surpasses conventional renderers. Utilizing a fully convolutional time-domain audio neural network (Conv-TasNet), we created a solution that takes an FOA input and provides a higher-order Ambisonics (HOA) output. This data-driven approach is novel when compared to typical physics- and psychoacoustics-based renderers. Quantitative evaluations showed a 0.6 dB average positional mean squared error difference between predicted and actual 3rd-order HOA. The median qualitative…
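To make the channel-count trade-off in the abstract concrete, the short sketch below computes how many channels a full-sphere Ambisonics signal carries at a given order, using the standard (N + 1)^2 relationship. The function name `ambisonic_channels` is hypothetical and chosen only for illustration; nothing here is taken from the paper's implementation.

```python
def ambisonic_channels(order: int) -> int:
    """Channel count of a full-sphere Ambisonics signal of the given order.

    Uses the standard relationship (order + 1) ** 2. The function name is
    an illustrative assumption, not something defined in the paper.
    """
    if order < 0:
        raise ValueError("order must be non-negative")
    return (order + 1) ** 2


foa_channels = ambisonic_channels(1)    # first order: the four-channel FOA format
hoa3_channels = ambisonic_channels(3)   # third order: the HOA target in the evaluation
upmix_ratio = hoa3_channels // foa_channels

print(f"FOA: {foa_channels} channels, 3rd-order HOA: {hoa3_channels} channels")
print(f"Output-to-input channel ratio: {upmix_ratio}x")
```

Under this relationship, a model mapping FOA to 3rd-order HOA has to produce 16 output channels from 4 input channels, which is why FOA is attractive as a compact input format.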

Taxonomy

Topics: Hearing Loss and Rehabilitation · Speech and Audio Processing · Music Technology and Sound Studies