A Universal Deep Room Acoustics Estimator
Paula Sánchez López, Paul Callens, Milos Cernak

TL;DR
This paper introduces a universal deep learning model that accurately estimates key room acoustic parameters directly from speech audio, even in challenging non-stationary noise conditions, outperforming existing methods.
Contribution
The paper presents a novel convolutional recurrent neural network that jointly estimates multiple room acoustic parameters without requiring room impulse response measurements.
Findings
Outperforms current state-of-the-art methods
Robust to non-stationary noise variations
Accurately estimates SNR, speech transmission index, reverberation time, clarity, and direct-to-reverberant ratio jointly
Abstract
Speech audio quality is subject to degradation caused by an acoustic environment and isotropic ambient and point noises. The environment can lead to decreased speech intelligibility and loss of focus and attention by the listener. Basic acoustic parameters that characterize the environment well are (i) signal-to-noise ratio (SNR), (ii) speech transmission index, (iii) reverberation time, (iv) clarity, and (v) direct-to-reverberant ratio. Except for the SNR, these parameters are usually derived from the Room Impulse Response (RIR) measurements; however, such measurements are often not available. This work presents a universal room acoustic estimator design based on convolutional recurrent neural networks that estimate the acoustic environment measurement blindly and jointly. Our results indicate that the proposed system is robust to non-stationary signal variations and outperforms…
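For context on what "derived from the RIR" means, the sketch below computes three of the listed parameters (reverberation time, clarity, and direct-to-reverberant ratio) from a known impulse response using textbook definitions. It is not the paper's estimator, which works blindly from speech. The function names, the 50 ms clarity window, the 2.5 ms direct-path window, the -5 to -25 dB fitting range, and the synthetic exponential RIR are all assumptions chosen for illustration.

```python
import numpy as np

def rt60_schroeder(rir, fs, fit_db=(-5.0, -25.0)):
    """Reverberation time from Schroeder backward integration.

    Fits a line to the energy decay curve between fit_db[0] and fit_db[1]
    (assumed range) and extrapolates the slope to a 60 dB decay.
    """
    energy = np.asarray(rir, dtype=float) ** 2
    edc = np.cumsum(energy[::-1])[::-1]          # energy remaining after each sample
    edc_db = 10.0 * np.log10(edc / edc[0])
    t = np.arange(len(energy)) / fs
    mask = (edc_db <= fit_db[0]) & (edc_db >= fit_db[1])
    slope, _ = np.polyfit(t[mask], edc_db[mask], 1)  # decay rate in dB per second
    return -60.0 / slope

def clarity_db(rir, fs, early_ms=50.0):
    """Clarity: early-to-late energy ratio in dB, split early_ms after the direct path."""
    energy = np.asarray(rir, dtype=float) ** 2
    onset = int(np.argmax(energy))               # direct path taken as the strongest sample
    split = onset + int(round(early_ms * 1e-3 * fs))
    return 10.0 * np.log10(energy[onset:split].sum() / energy[split:].sum())

def drr_db(rir, fs, direct_ms=2.5):
    """Direct-to-reverberant ratio in dB, with a +/- direct_ms window around the peak."""
    energy = np.asarray(rir, dtype=float) ** 2
    onset = int(np.argmax(energy))
    half = int(round(direct_ms * 1e-3 * fs))
    lo, hi = max(onset - half, 0), onset + half + 1
    direct = energy[lo:hi].sum()
    reverb = energy[:lo].sum() + energy[hi:].sum()
    return 10.0 * np.log10(direct / reverb)

# Hypothetical synthetic RIR: a unit direct impulse plus an exponential tail
# whose amplitude falls by 60 dB over target_rt60 seconds.
fs = 16000
target_rt60 = 0.5
t = np.arange(fs) / fs                           # 1 second of samples
tail = 0.1 * np.exp(-3.0 * np.log(10.0) * t / target_rt60)
rir = tail.copy()
rir[0] += 1.0                                    # direct path at sample 0

rt60 = rt60_schroeder(rir, fs)
c50 = clarity_db(rir, fs)
drr = drr_db(rir, fs)
print(f"RT60 = {rt60:.3f} s, C50 = {c50:.2f} dB, DRR = {drr:.2f} dB")
```

With these windows the direct-path window lies inside the early window, so the DRR can never exceed C50 for the same RIR. Measured RIRs also contain a noise floor, which a practical implementation would need to truncate before the backward integration.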
