Novel-View Acoustic Synthesis
Changan Chen, Alexander Richard, Roman Shapovalov, Vamsi Krishna Ithapu, Natalia Neverova, Kristen Grauman, Andrea Vedaldi

TL;DR
This paper introduces the novel-view acoustic synthesis task, proposing a neural rendering approach to synthesize scene sounds from unseen viewpoints using audio-visual cues, supported by new datasets and promising results.
Contribution
The paper gives the first formulation of the novel-view acoustic synthesis (NVAS) task, introduces the ViGAS neural network, and provides two large-scale multi-view audio-visual datasets, one synthetic and one real, for benchmarking this new problem.
Findings
The model synthesizes faithful audio from unseen viewpoints on both the synthetic and the real dataset.
Proposed datasets enable benchmarking of novel-view acoustic synthesis.
First to address this multi-modal, multi-view audio-visual synthesis challenge.
Abstract
We introduce the novel-view acoustic synthesis (NVAS) task: given the sight and sound observed at a source viewpoint, can we synthesize the sound of that scene from an unseen target viewpoint? We propose a neural rendering approach: Visually-Guided Acoustic Synthesis (ViGAS) network that learns to synthesize the sound of an arbitrary point in space by analyzing the input audio-visual cues. To benchmark this task, we collect two first-of-their-kind large-scale multi-view audio-visual datasets, one synthetic and one real. We show that our model successfully reasons about the spatial cues and synthesizes faithful audio on both datasets. To our knowledge, this work represents the very first formulation, dataset, and approach to solve the novel-view acoustic synthesis task, which has exciting potential applications ranging from AR/VR to art and design. Unlocked by this work, we believe that…
Taxonomy
Topics: Music Technology and Sound Studies · Hearing Loss and Rehabilitation · Advanced Vision and Imaging
