Impulse Response Data Augmentation and Deep Neural Networks for Blind Room Acoustic Parameter Estimation
Nicholas J. Bryan

TL;DR
This paper introduces an augmentation technique for acoustic impulse responses (AIRs) that parametrically controls the reverberation time (T60) and direct-to-reverberant ratio (DRR). A small set of real AIRs can then be expanded into a much larger, balanced dataset for training neural networks for blind room acoustic parameter estimation.
Contribution
The paper presents a parametric AIR augmentation method and a faster CNN baseline, advancing blind room acoustic parameter estimation with limited data.
Findings
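The augmentation method expands a small dataset of real AIRs into a balanced dataset orders of magnitude larger.
A previously proposed CNN trained on the augmented data outperforms past single-channel state-of-the-art methods.
The proposed baseline CNN is 4-5x faster and achieves comparable or better results.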
Abstract
The reverberation time (T60) and the direct-to-reverberant ratio (DRR) are commonly used to characterize room acoustic environments. Both parameters can be measured from an acoustic impulse response (AIR) or using blind estimation methods that perform estimation directly from speech. When neural networks are used for blind estimation, however, a large realistic dataset is needed, which is expensive and time consuming to collect. To address this, we propose an AIR augmentation method that can parametrically control the T60 and DRR, allowing us to expand a small dataset of real AIRs into a balanced dataset orders of magnitude larger. Using this method, we train a previously proposed convolutional neural network (CNN) and show we can outperform past single-channel state-of-the-art methods. We then propose a more efficient, straightforward baseline CNN that is 4-5x faster, which provides an…
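The abstract notes that T60 and DRR can be measured from an AIR and that the augmentation controls both parametrically. The sketch below illustrates those ideas on a synthetic AIR, using NumPy only. It is not the paper's implementation: the function names, the 16 kHz sample rate, the 2.5 ms direct-path half-window, the -5 to -25 dB fit range, and the toy AIR model (a unit impulse plus exponentially decaying noise) are all assumptions made for illustration. T60 is measured with standard Schroeder backward integration. The two "augmentation" functions are deliberately simple stand-ins: an exponential re-weighting of the whole response and a hard rectangular gain on the direct-path window.

```python
import numpy as np

FS = 16000  # sample rate in Hz (assumed for this sketch)

def synth_air(t60, dur=1.0, seed=0):
    """Toy AIR (hypothetical): unit direct path plus decaying Gaussian noise."""
    rng = np.random.default_rng(seed)
    t = np.arange(int(FS * dur)) / FS
    # Amplitude envelope 10^(-3 t / t60) loses 60 dB after t60 seconds.
    air = 0.1 * rng.standard_normal(t.size) * 10.0 ** (-3.0 * t / t60)
    air[0] = 1.0  # direct-path impulse
    return air

def measure_t60(air):
    """Estimate T60 from the Schroeder energy decay curve (EDC)."""
    edc = np.cumsum(air[::-1] ** 2)[::-1]      # backward-integrated energy
    edc_db = 10.0 * np.log10(edc / edc[0])
    t = np.arange(air.size) / FS
    fit = (edc_db <= -5.0) & (edc_db >= -25.0)  # fit range is an assumption
    slope, _ = np.polyfit(t[fit], edc_db[fit], 1)  # decay rate in dB/s
    return -60.0 / slope

def direct_window(air, half_width_ms=2.5):
    """Sample range around the largest peak, treated as the direct path."""
    peak = int(np.argmax(np.abs(air)))
    half = int(FS * half_width_ms / 1000)
    return max(peak - half, 0), min(peak + half + 1, air.size)

def measure_drr(air):
    """DRR in dB: direct-window energy over the energy of everything else."""
    lo, hi = direct_window(air)
    energy = air ** 2
    direct = energy[lo:hi].sum()
    return 10.0 * np.log10(direct / (energy.sum() - direct))

def reshape_t60(air, t60_old, t60_new):
    """Simplified T60 change: re-weight the AIR by an exponential in time."""
    t = np.arange(air.size) / FS
    return air * 10.0 ** (-3.0 * t * (1.0 / t60_new - 1.0 / t60_old))

def shift_drr(air, delta_db):
    """Simplified DRR change: apply a gain to the direct-path window only."""
    lo, hi = direct_window(air)
    out = air.copy()
    out[lo:hi] *= 10.0 ** (delta_db / 20.0)
    return out

air = synth_air(t60=0.5)
print("T60 (s):", measure_t60(air), "DRR (dB):", measure_drr(air))
print("T60 after reshaping (s):", measure_t60(reshape_t60(air, 0.5, 0.3)))
print("DRR after +6 dB shift (dB):", measure_drr(shift_drr(air, 6.0)))
```

In this sketch, shortening the T60 also removes reverberant energy and so raises the DRR. A real augmentation pipeline would have to handle that coupling, along with the measurement noise floor of recorded AIRs, which the toy model ignores.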
