Neural Sound Field Decomposition with Super-resolution of Sound Direction

Qiuqiang Kong; Shilei Liu; Junjie Shi; Xuzhou Ye; Yin Cao; Qiaoxi Zhu; Yong Xu; Yuxuan Wang

arXiv:2210.12345·cs.SD·October 25, 2022

Open Access

TL;DR

This paper introduces a neural network-based framework for high-resolution sound field decomposition from limited microphone data, significantly improving spatial resolution and source localization accuracy over traditional methods.

Contribution

The paper presents a novel learning-based approach for sound field decomposition that achieves super-resolution in spatial directions using neural networks and limited microphone inputs.

Findings

01 · NeSD outperforms Ambisonics and DOANet in decomposition accuracy.

02 · NeSD improves source localization on speech, music, and sound events.

03 · NeSD demonstrates effective super-resolution of sound directions.

Abstract

Sound field decomposition predicts waveforms in arbitrary directions using signals from a limited number of microphones as inputs. Sound field decomposition is fundamental to downstream tasks, including source localization, source separation, and spatial audio reproduction. Conventional sound field decomposition methods such as Ambisonics have limited spatial decomposition resolution. This paper proposes a learning-based Neural Sound field Decomposition (NeSD) framework to allow sound field decomposition with fine spatial direction resolution, using recordings from microphone capsules of a few microphones at arbitrary positions. The inputs of a NeSD system include microphone signals, microphone positions, and queried directions. The outputs of a NeSD include the waveform and the presence probability of a queried position. We model the NeSD systems respectively with different neural…
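The abstract's remark that Ambisonics has limited spatial resolution can be illustrated with a small sketch. This is not the paper's method or its evaluation. It is a toy horizontal first-order Ambisonics encoder paired with a virtual cardioid decoder. The function names (`encode_foa`, `decode_cardioid`, `leakage_gain`) and the unnormalized W channel are assumptions made for illustration. Under these assumptions, a single source still appears at half amplitude in a direction 90 degrees away from it, which is the kind of directional blur a finer decomposition aims to reduce.

```python
import numpy as np


def encode_foa(signal, azimuth):
    """Encode a mono signal into horizontal first-order Ambisonics (W, X, Y).

    Assumption: W carries the signal with unit gain (no 1/sqrt(2) scaling).
    """
    w = signal
    x = signal * np.cos(azimuth)
    y = signal * np.sin(azimuth)
    return np.stack([w, x, y])


def decode_cardioid(foa, query_azimuth):
    """Steer a virtual cardioid microphone toward the queried azimuth."""
    w, x, y = foa
    return 0.5 * (w + x * np.cos(query_azimuth) + y * np.sin(query_azimuth))


def leakage_gain(source_azimuth, query_azimuth):
    """Gain with which a source shows up in the queried direction."""
    signal = np.ones(8)  # constant test signal, so the output equals the gain
    decoded = decode_cardioid(encode_foa(signal, source_azimuth), query_azimuth)
    return float(decoded[0])


if __name__ == "__main__":
    for degrees in (0, 45, 90, 135, 180):
        gain = leakage_gain(0.0, np.deg2rad(degrees))
        print(f"query {degrees:3d} deg -> gain {gain:.3f}")
```

The gain follows 0.5 · (1 + cos Δ), where Δ is the angle between the source and the queried direction, so the pattern only reaches zero directly opposite the source.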

Taxonomy

Topics: Speech and Audio Processing · Underwater Acoustics Research · Blind Source Separation Techniques