SoundSpaces 2.0: A Simulation Platform for Visual-Acoustic Learning
Changan Chen, Carl Schissler, Sanchit Garg, Philip Kobernik, Alexander Clegg, Paul Calamia, Dhruv Batra, Philip W Robinson, Kristen Grauman

TL;DR
SoundSpaces 2.0 is a high-fidelity, real-time geometry-based audio simulation platform for 3D environments, enabling advanced audio-visual research and embodied learning tasks.
Contribution
It introduces a novel, fast, and realistic acoustic simulation platform that supports continuous spatial sampling and generalizes to new environments, advancing audio-visual research capabilities.
Findings
The simulator shows high fidelity when benchmarked against real-world audio measurements.
It improves performance on embodied navigation tasks.
It supports effective sim2real transfer for automatic speech recognition.
Abstract
We introduce SoundSpaces 2.0, a platform for on-the-fly geometry-based audio rendering for 3D environments. Given a 3D mesh of a real-world environment, SoundSpaces can generate highly realistic acoustics for arbitrary sounds captured from arbitrary microphone locations. Together with existing 3D visual assets, it supports an array of audio-visual research tasks, such as audio-visual navigation, mapping, source localization and separation, and acoustic matching. Compared to existing resources, SoundSpaces 2.0 has the advantages of allowing continuous spatial sampling, generalization to novel environments, and configurable microphone and material properties. To our knowledge, this is the first geometry-based acoustic simulation that offers high fidelity and realism while also being fast enough to use for embodied learning. We showcase the simulator's properties and benchmark its…
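As a rough illustration of what "generating acoustics for arbitrary sounds" means in practice: geometry-based simulators typically produce an impulse response for a given source and microphone placement, and the sound heard at the microphone is the dry source signal convolved with that response. The sketch below uses only NumPy and a hand-written toy impulse response. It does not call the SoundSpaces API, and the function name `render_at_microphone` and the toy values are hypothetical, chosen only for this example.

```python
import numpy as np


def render_at_microphone(source, impulse_response):
    """Convolve a dry mono signal with an impulse response (hypothetical helper)."""
    source = np.asarray(source, dtype=np.float64)
    impulse_response = np.asarray(impulse_response, dtype=np.float64)
    # Full convolution keeps the reverberant tail that extends past the source.
    return np.convolve(source, impulse_response, mode="full")


# Toy impulse response (an assumption, not simulator output):
# direct sound at sample 0, one echo at half amplitude 3 samples later.
toy_ir = np.array([1.0, 0.0, 0.0, 0.5])

# A single click as the dry source.
click = np.array([1.0, 0.0])

received = render_at_microphone(click, toy_ir)
```

In a real pipeline the toy array would be replaced by an impulse response rendered by the simulator for the chosen scene, source position, and microphone configuration; the convolution step itself stays the same.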
Taxonomy
Topics: Speech and Audio Processing · Music Technology and Sound Studies · Music and Audio Processing
