Ambisonics Super-Resolution Using A Waveform-Domain Neural Network
Ismael Nawfal, Symeon Delikaris Manias, Mehrez Souden, Juha Merimaa, Joshua Atkins, Elisabeth McMullin, Shadi Pirhosseinloo, Daniel Phillips

TL;DR
This paper introduces a neural-network method that upmixes first-order Ambisonics (FOA) to higher-order Ambisonics (HOA), improving spatial accuracy and perceived audio quality while retaining the efficiency of the four-channel FOA input format.
Contribution
A data-driven approach, built on a fully convolutional time-domain network (Conv-TasNet), that converts FOA input to HOA output and surpasses conventional physics- and psychoacoustics-based renderers in quality.
Findings
0.6 dB average positional mean squared error difference between predicted and actual 3rd-order HOA
80% improvement in perceived audio quality
Outperforms conventional renderers in quantitative and qualitative assessments
Abstract
Ambisonics is a spatial audio format describing a sound field. First-order Ambisonics (FOA) is a popular format comprising only four channels. This limited channel count comes at the expense of spatial accuracy. Ideally, one would be able to keep the efficiency of the FOA format without its limitations. We have devised a data-driven spatial audio solution that retains the efficiency of the FOA format but achieves quality that surpasses conventional renderers. Utilizing a fully convolutional time-domain audio neural network (Conv-TasNet), we created a solution that takes an FOA input and provides a higher-order Ambisonics (HOA) output. This data-driven approach is novel when compared to typical physics- and psychoacoustics-based renderers. Quantitative evaluations showed a 0.6 dB average positional mean squared error difference between predicted and actual 3rd-order HOA. The median qualitative…
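To make the channel-count trade-off in the abstract concrete: a full-sphere Ambisonics signal of order N carries (N + 1)^2 channels, so FOA has 4 channels and 3rd-order HOA has 16. The short sketch below only illustrates that arithmetic; the function name is a hypothetical helper, not something taken from the paper.

```python
def ambisonic_channel_count(order: int) -> int:
    """Channels in a full-sphere Ambisonics signal of the given order: (N + 1)^2."""
    if order < 0:
        raise ValueError("order must be non-negative")
    return (order + 1) ** 2


if __name__ == "__main__":
    # Compare the FOA input (order 1) with the 3rd-order HOA target.
    for order in (1, 3):
        print(f"order {order}: {ambisonic_channel_count(order)} channels")
```

Under this formula, a network mapping FOA to 3rd-order HOA takes 4 input channels and produces 16 output channels.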
Taxonomy
Topics: Hearing Loss and Rehabilitation · Speech and Audio Processing · Music Technology and Sound Studies
