RIR-Mega: a large-scale simulated room impulse response dataset for machine learning and room acoustics modeling
Mandip Goswami

TL;DR
RIR-Mega is a large-scale simulated room impulse response (RIR) dataset for machine learning and room acoustics research, distributed with validation tools and a baseline regression benchmark.
Contribution
The paper introduces RIR-Mega, a large, publicly available simulated RIR dataset with a standardized metadata schema and baseline tools for machine learning and acoustics modeling.
Findings
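A small Random Forest baseline predicting RT60-like targets reaches a mean absolute error near 0.013 s and a root mean square error near 0.022 s on a 36,000/4,000 train/validation split
The complete archive holds 50,000 RIRs on Zenodo; a subset of 1,000 linear array and 3,000 circular array RIRs is hosted on Hugging Face
A Hugging Face Datasets loader and scripts for metadata checks and checksums are provided for validation and reuse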
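The checksum validation mentioned above can be illustrated with a minimal sketch. It is not the dataset's own script: the function names `sha256_of` and `verify`, the manifest layout (relative path mapped to a SHA-256 hex digest), and the file name `rir_000001.wav` are all assumptions made for illustration.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(manifest, root):
    """Return the relative paths that are missing or whose digest does not match."""
    failures = []
    for rel_path, expected in manifest.items():
        target = Path(root) / rel_path
        if not target.is_file() or sha256_of(target) != expected:
            failures.append(rel_path)
    return failures


# Self-contained demo on a throwaway file (hypothetical name, placeholder bytes).
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "rir_000001.wav"
    sample.write_bytes(b"placeholder bytes, not real audio")
    manifest = {"rir_000001.wav": sha256_of(sample)}

    intact_failures = verify(manifest, tmp)      # file unchanged
    sample.write_bytes(b"corrupted")
    corrupted_failures = verify(manifest, tmp)   # file altered after hashing

print(intact_failures, corrupted_failures)
```

Reading in chunks keeps memory use flat regardless of file size, which matters when checking a multi-gigabyte archive.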
Abstract
Room impulse responses are a core resource for dereverberation, robust speech recognition, source localization, and room acoustics estimation. We present RIR-Mega, a large collection of simulated RIRs described by a compact, machine-friendly metadata schema and distributed with simple tools for validation and reuse. The dataset ships with a Hugging Face Datasets loader, scripts for metadata checks and checksums, and a reference regression baseline that predicts RT60-like targets from waveforms. On a train and validation split of 36,000 and 4,000 examples, a small Random Forest on lightweight time and spectral features reaches a mean absolute error near 0.013 s and a root mean square error near 0.022 s. We host a subset with 1,000 linear array RIRs and 3,000 circular array RIRs on Hugging Face for streaming and quick tests, and preserve the complete 50,000 RIR archive on Zenodo. The…
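To make the RT60 target and the two reported error metrics concrete, here is a minimal sketch under stated assumptions. It is not the paper's Random Forest baseline: it swaps in a classical estimate (Schroeder backward integration with a line fit between -5 dB and -25 dB, extrapolated to 60 dB) and applies it to synthetic decaying-noise signals rather than RIR-Mega data. The helper names `synth_rir` and `estimate_rt60`, the 16 kHz sample rate, and the chosen decay times are illustrative assumptions.

```python
import math
import random


def synth_rir(rt60, fs=16000, dur=1.0, seed=0):
    """Gaussian noise with an exponential envelope whose energy falls 60 dB in rt60 s."""
    rng = random.Random(seed)
    # Amplitude envelope exp(-k t); energy exp(-2 k t) reaches -60 dB at t = rt60.
    k = 3.0 * math.log(10.0) / rt60
    return [rng.gauss(0.0, 1.0) * math.exp(-k * n / fs) for n in range(int(fs * dur))]


def estimate_rt60(rir, fs=16000, lo_db=-5.0, hi_db=-25.0):
    """Estimate RT60 from the Schroeder energy decay curve (T20-style fit)."""
    # Backward integration: energy remaining from each sample to the end.
    edc, total = [], 0.0
    for x in reversed(rir):
        total += x * x
        edc.append(total)
    edc.reverse()
    edc_db = [10.0 * math.log10(v / edc[0]) for v in edc if v > 0.0]

    # Least-squares line through the portion of the curve between lo_db and hi_db.
    pts = [(n / fs, d) for n, d in enumerate(edc_db) if hi_db <= d <= lo_db]
    t_mean = sum(t for t, _ in pts) / len(pts)
    d_mean = sum(d for _, d in pts) / len(pts)
    slope = (sum((t - t_mean) * (d - d_mean) for t, d in pts)
             / sum((t - t_mean) ** 2 for t, _ in pts))  # dB per second, negative
    return -60.0 / slope


true_rt60 = [0.3, 0.5, 0.8]  # illustrative decay times in seconds
pred_rt60 = [estimate_rt60(synth_rir(r, seed=i)) for i, r in enumerate(true_rt60)]

errors = [p - t for p, t in zip(pred_rt60, true_rt60)]
mae = sum(abs(e) for e in errors) / len(errors)               # mean absolute error
rmse = math.sqrt(sum(e * e for e in errors) / len(errors))    # root mean square error
print(f"MAE = {mae:.4f} s, RMSE = {rmse:.4f} s")
```

RMSE is never smaller than MAE on the same errors, so a gap between the two, like the one between 0.013 s and 0.022 s reported above, indicates that a minority of examples carry comparatively large errors.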
Taxonomy
Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Face recognition and analysis
