A Universal Deep Room Acoustics Estimator

Paula Sánchez López; Paul Callens; Milos Cernak

arXiv:2109.14436 · eess.AS · April 5, 2022

TL;DR

This paper introduces a universal deep learning model that accurately estimates key room acoustic parameters directly from speech audio, even in challenging non-stationary noise conditions, outperforming existing methods.

Contribution

The paper presents a novel convolutional recurrent neural network that jointly estimates multiple room acoustic parameters without requiring room impulse response measurements.

Findings

01 Outperforms current state-of-the-art methods

02 Robust to non-stationary noise variations

03 Accurately estimates multiple acoustic parameters

Abstract

Speech audio quality is subject to degradation caused by an acoustic environment and isotropic ambient and point noises. The environment can lead to decreased speech intelligibility and loss of focus and attention by the listener. Basic acoustic parameters that characterize the environment well are (i) signal-to-noise ratio (SNR), (ii) speech transmission index, (iii) reverberation time, (iv) clarity, and (v) direct-to-reverberant ratio. Except for the SNR, these parameters are usually derived from the Room Impulse Response (RIR) measurements; however, such measurements are often not available. This work presents a universal room acoustic estimator design based on convolutional recurrent neural networks that estimate the acoustic environment measurement blindly and jointly. Our results indicate that the proposed system is robust to non-stationary signal variations and outperforms…
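For context, the conventional route that the abstract contrasts with blind estimation can be sketched as follows. This is not the paper's method: it is a minimal illustration, under stated assumptions, of how reverberation time, clarity, and direct-to-reverberant ratio are typically computed when an RIR is available. The synthetic exponentially decaying RIR, the function names, the T20-style fit range (-5 dB to -25 dB), the 50 ms clarity split, and the 2.5 ms direct-path window are all illustrative choices, not values taken from the source.

```python
import math
import random


def energy_decay_curve_db(rir):
    """Schroeder backward integration: energy remaining after each sample, in dB re total."""
    edc = [0.0] * len(rir)
    remaining = 0.0
    for i in range(len(rir) - 1, -1, -1):
        remaining += rir[i] * rir[i]
        edc[i] = remaining
    total = edc[0]
    # The curve is non-increasing, so any zeros sit at the end and indices are preserved.
    return [10.0 * math.log10(e / total) for e in edc if e > 0.0]


def reverberation_time(rir, fs, start_db=-5.0, end_db=-25.0):
    """T60 extrapolated from a least-squares line fitted to the decay curve (assumed T20-style range)."""
    edc = energy_decay_curve_db(rir)
    pts = [(i / fs, d) for i, d in enumerate(edc) if end_db <= d <= start_db]
    mean_t = sum(t for t, _ in pts) / len(pts)
    mean_d = sum(d for _, d in pts) / len(pts)
    slope = (sum((t - mean_t) * (d - mean_d) for t, d in pts)
             / sum((t - mean_t) ** 2 for t, _ in pts))  # dB per second, negative
    return -60.0 / slope


def clarity(rir, fs, early_ms=50.0):
    """Early-to-late energy ratio in dB (C50 by default), measured from the direct-path peak."""
    peak = max(range(len(rir)), key=lambda i: abs(rir[i]))
    split = peak + int(round(early_ms * 1e-3 * fs))
    early = sum(x * x for x in rir[peak:split])
    late = sum(x * x for x in rir[split:])
    return 10.0 * math.log10(early / late)


def direct_to_reverberant_ratio(rir, fs, window_ms=2.5):
    """Energy in a short window around the peak versus all remaining energy, in dB."""
    peak = max(range(len(rir)), key=lambda i: abs(rir[i]))
    half = int(round(window_ms * 1e-3 * fs))
    lo, hi = max(0, peak - half), peak + half + 1
    direct = sum(x * x for x in rir[lo:hi])
    total = sum(x * x for x in rir)
    return 10.0 * math.log10(direct / (total - direct))


def synthetic_rir(t60, fs, seconds=1.0, seed=0):
    """Hypothetical test RIR: unit direct path plus Gaussian noise decaying 60 dB in t60 seconds."""
    rng = random.Random(seed)
    decay = 3.0 * math.log(10.0) / (t60 * fs)  # per-sample amplitude decay rate
    tail = [0.1 * rng.gauss(0.0, 1.0) * math.exp(-decay * n)
            for n in range(1, int(seconds * fs))]
    return [1.0] + tail


fs = 16000
rir = synthetic_rir(t60=0.5, fs=fs)
t60 = reverberation_time(rir, fs)
c50 = clarity(rir, fs)
drr = direct_to_reverberant_ratio(rir, fs)
print(f"T60 ~ {t60:.2f} s, C50 ~ {c50:.1f} dB, DRR ~ {drr:.1f} dB")
```

Every function here takes the RIR as input, which is exactly the measurement the abstract notes is often unavailable; a blind estimator would have to predict the same quantities from the reverberant, noisy speech signal alone.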
