Scene-wide Acoustic Parameter Estimation
Ricardo Falcon-Perez, Ruohan Gao, Gregor Mueckl, Sebastia V. Amengual Gari, Ishwarya Ananthabhotla

TL;DR
This paper introduces a novel method to estimate spatially-distributed acoustic parameters across entire scenes from simple scene representations, enhancing AR/VR audio realism without complex RIR measurements.
Contribution
It proposes an image-to-image translation approach conditioned on calibration RIRs to infer acoustic parameters from 2D floor plans, including directionally-dependent data, and provides a new large dataset.
Findings
The method outperforms statistical baselines in estimating acoustic parameters.
It successfully predicts directionally-dependent acoustic parameters.
The approach is validated on a new 1000-room dataset.
Abstract
For augmented reality (AR) and virtual reality (VR) applications, accurate estimates of the acoustic characteristics of a scene are critical for creating a sense of immersion. However, directly estimating room impulse responses (RIRs) from scene geometry is often a challenging, data-expensive task. We propose a method to instead infer spatially-distributed acoustic parameters (such as C50 and T60) for an entire scene from lightweight information readily available in an AR/VR context. We consider an image-to-image translation task to transform a 2D floormap, conditioned on a calibration RIR measurement, into 2D heatmaps of acoustic parameters. Moreover, we show that the method also works for directionally-dependent (i.e., beamformed) parameter prediction. We introduce and release a 1000-room, complex-scene dataset to study the task, and demonstrate improvements over strong statistical baselines.
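For readers unfamiliar with the parameters named in the abstract, the sketch below shows how C50 and T60 are conventionally computed from a single RIR, using the standard textbook definitions (early-to-late energy ratio at 50 ms, and a line fit to the Schroeder backward-integrated decay curve). It is not taken from the paper: the function names, the -5 dB to -25 dB fit range, and the synthetic exponentially decaying noise used as a stand-in RIR are all assumptions made for illustration.

```python
import math
import random


def clarity_c50(rir, fs):
    """C50 in dB: energy in the first 50 ms divided by the energy after 50 ms."""
    split = int(round(0.050 * fs))
    early = sum(x * x for x in rir[:split])
    late = sum(x * x for x in rir[split:])
    return 10.0 * math.log10(early / late)


def schroeder_edc_db(rir):
    """Energy decay curve (dB) by backward integration, 0 dB at the first sample."""
    edc = []
    running = 0.0
    for x in reversed(rir):
        running += x * x
        edc.append(running)
    edc.reverse()
    total = edc[0]
    # The curve is non-increasing, so any zero entries can only sit at the tail.
    return [10.0 * math.log10(e / total) for e in edc if e > 0.0]


def t60_from_rir(rir, fs, hi_db=-5.0, lo_db=-25.0):
    """Least-squares line fit to the decay curve, extrapolated to -60 dB.

    The fit range (hi_db, lo_db) is an assumed, T20-style choice.
    """
    edc_db = schroeder_edc_db(rir)
    pts = [(i / fs, d) for i, d in enumerate(edc_db) if lo_db <= d <= hi_db]
    n = len(pts)
    mean_t = sum(t for t, _ in pts) / n
    mean_d = sum(d for _, d in pts) / n
    num = sum((t - mean_t) * (d - mean_d) for t, d in pts)
    den = sum((t - mean_t) ** 2 for t, _ in pts)
    slope_db_per_s = num / den  # negative for a decaying response
    return -60.0 / slope_db_per_s


def synthetic_rir(t60, fs, duration, seed=0):
    """Hypothetical test signal: Gaussian noise with an exponential envelope.

    The envelope exp(-a * t) uses a = ln(10^3) / t60, so the energy
    falls by 60 dB after t60 seconds.
    """
    rng = random.Random(seed)
    a = math.log(1000.0) / t60
    n = int(duration * fs)
    return [rng.gauss(0.0, 1.0) * math.exp(-a * i / fs) for i in range(n)]


if __name__ == "__main__":
    fs = 8000
    rir = synthetic_rir(t60=0.5, fs=fs, duration=1.0)
    print("C50 (dB):", clarity_c50(rir, fs))
    print("T60 (s):", t60_from_rir(rir, fs))
```

In the setting the abstract describes, values like these would be the per-position targets that make up each 2D heatmap, rather than quantities computed at inference time.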
Taxonomy
Topics: Speech and Audio Processing
MethodsFocus
