Hearing Anything Anywhere
Mason Wang, Ryosuke Sawata, Samuel Clarke, Ruohan Gao, Shangzhe Wu, Jiajun Wu

TL;DR
This paper presents DiffRIR, a novel differentiable framework that reconstructs spatial acoustic properties from sparse recordings, enabling realistic auditory scene synthesis for immersive mixed reality experiences.
Contribution
We introduce DiffRIR, a differentiable RIR rendering model with interpretable parameters, capable of reconstructing acoustic environments from minimal data and synthesizing novel auditory experiences.
Findings
Outperforms state-of-the-art baselines on RIR and music rendering
Learns physically interpretable acoustic parameters
Effective with only sparse recordings and scene reconstructions
Abstract
Recent years have seen immense progress in 3D computer vision and computer graphics, with emerging tools that can virtualize real-world 3D environments for numerous Mixed Reality (XR) applications. However, alongside immersive visual experiences, immersive auditory experiences are equally vital to our holistic perception of an environment. In this paper, we aim to reconstruct the spatial acoustic characteristics of an arbitrary environment given only a sparse set of (roughly 12) room impulse response (RIR) recordings and a planar reconstruction of the scene, a setup that is easily achievable by ordinary users. To this end, we introduce DiffRIR, a differentiable RIR rendering framework with interpretable parametric models of salient acoustic features of the scene, including sound source directivity and surface reflectivity. This allows us to synthesize novel auditory experiences through…
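As a minimal illustration of why RIRs matter for rendering audio: the sound heard at a listener position can be modeled as the dry source signal convolved with the RIR for that source and listener pair. The sketch below shows only this standard relationship with a toy, hand-made RIR. It is not the DiffRIR model, and the function name `render_audio` and all values are hypothetical.

```python
import numpy as np

def render_audio(dry_signal, rir):
    """Convolve a dry source signal with a room impulse response (RIR)."""
    return np.convolve(dry_signal, rir, mode="full")

# Toy RIR (assumed values): direct sound at sample 0,
# plus a single reflection at sample 3 with half the amplitude.
rir = np.zeros(5)
rir[0] = 1.0
rir[3] = 0.5

# Toy dry signal (assumed values).
dry = np.array([1.0, 2.0, 0.0])

# Rendered signal has length len(dry) + len(rir) - 1.
wet = render_audio(dry, rir)
print(wet)
```

In this toy case the output contains the original samples followed by a delayed, attenuated copy, which is the echo contributed by the single reflection.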
Taxonomy
Topics: Hearing Loss and Rehabilitation · Music Technology and Sound Studies · Speech and Audio Processing
Methods: Sparse Evolutionary Training
