Image2Reverb: Cross-Modal Reverb Impulse Response Synthesis
Nikhil Singh, Jeff Mentch, Jerry Ng, Matthew Beveridge, and Iddo Drori

TL;DR
This paper introduces Image2Reverb, a neural network method that synthesizes plausible room impulse responses from single images. This enables acoustic simulation without physical measurements, which is useful for inaccessible or virtual environments.
Contribution
The work presents a novel end-to-end neural network architecture that generates plausible acoustic impulse responses directly from images, covering diverse environments and formats.
Findings
Generated IRs closely match ground truth data.
Human experts rated the synthesized IRs as plausible.
The method applies to varied image sources, including paintings and virtual scenes.
Abstract
Measuring the acoustic characteristics of a space is often done by capturing its impulse response (IR), a representation of how a full-range stimulus sound excites it. This work generates an IR from a single image, which can then be applied to other signals using convolution, simulating the reverberant characteristics of the space shown in the image. Recording these IRs is both time-intensive and expensive, and often infeasible for inaccessible locations. We use an end-to-end neural network architecture to generate plausible audio impulse responses from single images of acoustic environments. We evaluate our method both by comparisons to ground truth data and by human expert evaluation. We demonstrate our approach by generating plausible impulse responses from diverse settings and formats including well known places, musical halls, rooms in paintings, images from animations and computer…
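The abstract notes that a generated IR can be applied to other signals using convolution. A minimal sketch of that step follows, using plain Python lists and toy values. The function name `apply_reverb` and the sample data are illustrative assumptions, not part of the paper or its code; real audio would normally use an FFT-based convolution from a numerical library rather than this direct loop.

```python
def apply_reverb(dry, ir):
    """Full linear convolution of a dry signal with an impulse response.

    Both arguments are sequences of float samples at the same sample rate.
    The result has len(dry) + len(ir) - 1 samples (the reverb tail extends
    past the end of the dry signal).
    """
    if not dry or not ir:
        return []
    out = [0.0] * (len(dry) + len(ir) - 1)
    for i, x in enumerate(dry):
        for j, h in enumerate(ir):
            # Each input sample triggers a scaled, delayed copy of the IR.
            out[i + j] += x * h
    return out


# Toy impulse response: a direct sound followed by two decaying reflections.
# (Hypothetical values; a real IR would hold thousands of samples.)
ir = [1.0, 0.5, 0.25]

# Toy dry signal: two clicks, the second twice as loud.
dry = [1.0, 0.0, 0.0, 2.0]

wet = apply_reverb(dry, ir)
print(wet)
```

Convolving a single unit click with the IR returns the IR itself, which is why an impulse response fully characterizes a linear, time-invariant reverberant space.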
Taxonomy
Topics: Hearing Loss and Rehabilitation · Speech and Audio Processing · Music and Audio Processing
