Real Acoustic Fields: An Audio-Visual Room Acoustics Dataset and Benchmark
Ziyang Chen, Israel D. Gebru, Christian Richardt, Anurag Kumar, William Laney, Andrew Owens, Alexander Richard

TL;DR
This paper introduces the Real Acoustic Fields (RAF) dataset, a comprehensive real-world audio-visual room acoustics dataset with high-quality impulse responses, multi-view images, and pose data, enabling improved evaluation and development of neural acoustic models.
Contribution
The paper provides the first densely captured real-world acoustic dataset with multi-modal data. It also evaluates existing models, proposes settings that improve them on real-world data, and demonstrates a sim2real approach for better real-world performance.
Findings
Existing models perform better with real-world data after fine-tuning.
Incorporating visual data improves neural acoustic field models.
Pre-training on simulated data and fine-tuning enhances few-shot learning.
Abstract
We present a new dataset called Real Acoustic Fields (RAF) that captures real acoustic room data from multiple modalities. The dataset includes high-quality and densely captured room impulse response data paired with multi-view images, and precise 6DoF pose tracking data for sound emitters and listeners in the rooms. We used this dataset to evaluate existing methods for novel-view acoustic synthesis and impulse response generation, which previously relied on synthetic data. In our evaluation, we thoroughly assessed existing audio and audio-visual models against multiple criteria and proposed settings to enhance their performance on real-world data. We also conducted experiments to investigate the impact of incorporating visual data (i.e., images and depth) into neural acoustic field models. Additionally, we demonstrated the effectiveness of a simple sim2real approach, where a model is…
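For readers new to room impulse responses (RIRs), the sketch below illustrates the standard way an RIR is used: the sound heard at a listener position is the dry source signal convolved with the RIR for that emitter-listener pair. This is a generic illustration, not code from the paper or the RAF dataset. The function name `render_with_rir` and the toy arrays are hypothetical.

```python
import numpy as np


def render_with_rir(dry, rir):
    """Convolve a dry (anechoic) signal with a room impulse response.

    Hypothetical helper for illustration only; it is not part of the
    RAF dataset tooling. Both inputs are 1-D sample arrays that are
    assumed to share the same sample rate.
    """
    dry = np.asarray(dry, dtype=np.float64)
    rir = np.asarray(rir, dtype=np.float64)
    # Full linear convolution: output length is len(dry) + len(rir) - 1.
    return np.convolve(dry, rir, mode="full")


# Toy RIR (made-up values): a direct path, then one echo at half
# amplitude arriving three samples later.
rir = np.array([1.0, 0.0, 0.0, 0.5])
dry = np.array([1.0, 2.0])

wet = render_with_rir(dry, rir)
print(wet)
```

In this toy case the output contains the original two samples followed by a delayed, attenuated copy. A measured RIR has many thousands of taps, for which an FFT-based convolution is usually the faster choice.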
Taxonomy
Topics: Speech and Audio Processing · Music and Audio Processing
