Blind Room Parameter Estimation Using Multiple-Multichannel Speech Recordings
Prerak Srivastava, Antoine Deleforge, Emmanuel Vincent

TL;DR
This paper introduces a neural network-based method for blindly estimating room parameters (total surface area, volume, frequency-dependent reverberation time and mean surface absorption) from two-channel noisy speech recordings, and reports better accuracy than existing blind volume estimation methods.
Contribution
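It presents a novel convolutional neural network architecture that uses both single- and inter-channel cues to jointly estimate multiple room parameters from two-channel noisy speech recorded at multiple, unknown source-receiver positions. The network is trained on a large, realistic simulated dataset.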
Findings
Using multiple observations in one room significantly reduces estimation errors and variances on all target quantities.
Two-channel data improves surface and volume estimation.
The proposed model outperforms existing blind volume estimation methods.
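As a rough intuition for the first finding, the toy sketch below averages several independent noisy estimates of one room quantity and compares the spread with that of a single estimate. The Gaussian noise model, the averaging rule, the function names and all numbers are assumptions made for illustration; the summary does not say how the paper combines observations.

```python
import random
import statistics


def simulate_estimates(true_value, noise_std, n_obs, rng):
    """Draw n_obs noisy per-recording estimates (hypothetical Gaussian noise model)."""
    return [rng.gauss(true_value, noise_std) for _ in range(n_obs)]


def aggregate(estimates):
    """Combine per-recording estimates by simple averaging (an assumption, not the paper's rule)."""
    return statistics.fmean(estimates)


rng = random.Random(0)   # fixed seed so the run is repeatable
true_volume = 100.0      # m^3, hypothetical room volume
noise_std = 20.0         # m^3, hypothetical per-recording error
n_trials = 2000

# Spread of the final estimate with 1 recording versus 8 recordings per room.
single = [aggregate(simulate_estimates(true_volume, noise_std, 1, rng)) for _ in range(n_trials)]
multi = [aggregate(simulate_estimates(true_volume, noise_std, 8, rng)) for _ in range(n_trials)]

std_single = statistics.pstdev(single)
std_multi = statistics.pstdev(multi)
print(f"std with 1 observation:  {std_single:.2f} m^3")
print(f"std with 8 observations: {std_multi:.2f} m^3")
```

Under these assumptions the spread shrinks roughly with the square root of the number of observations. Real recordings from one room are unlikely to have fully independent errors, so the gain in practice may be smaller.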
Abstract
Knowing the geometrical and acoustical parameters of a room may benefit applications such as audio augmented reality, speech dereverberation or audio forensics. In this paper, we study the problem of jointly estimating the total surface area, the volume, as well as the frequency-dependent reverberation time and mean surface absorption of a room in a blind fashion, based on two-channel noisy speech recordings from multiple, unknown source-receiver positions. A novel convolutional neural network architecture leveraging both single- and inter-channel cues is proposed and trained on a large, realistic simulated dataset. Results on both simulated and real data show that using multiple observations in one room significantly reduces estimation errors and variances on all target quantities, and that using two channels helps the estimation of surface and volume. The proposed model outperforms a…
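The four target quantities in the abstract are linked by classical room acoustics. The sketch below uses Sabine's formula, RT60 ≈ 0.161 · V / (S · ᾱ), a textbook approximation for diffuse sound fields that is not stated to be part of the paper's method. The function names, room dimensions and per-band absorption values are hypothetical.

```python
SABINE_CONSTANT = 0.161  # s/m, for air at room temperature


def sabine_rt60(volume_m3, surface_m2, mean_absorption):
    """Reverberation time in seconds from Sabine's formula."""
    if volume_m3 <= 0 or surface_m2 <= 0 or not 0 < mean_absorption <= 1:
        raise ValueError("volume and surface must be positive, absorption in (0, 1]")
    return SABINE_CONSTANT * volume_m3 / (surface_m2 * mean_absorption)


def mean_absorption_from_rt60(volume_m3, surface_m2, rt60_s):
    """Invert Sabine's formula to get the mean surface absorption coefficient."""
    if volume_m3 <= 0 or surface_m2 <= 0 or rt60_s <= 0:
        raise ValueError("all inputs must be positive")
    return SABINE_CONSTANT * volume_m3 / (surface_m2 * rt60_s)


# Hypothetical shoebox room of 5 m x 4 m x 3 m.
length, width, height = 5.0, 4.0, 3.0
volume = length * width * height                                     # 60 m^3
surface = 2 * (length * width + length * height + width * height)   # 94 m^2

# Hypothetical mean absorption per octave band (Hz -> coefficient).
absorption_by_band = {125: 0.10, 1000: 0.20, 4000: 0.30}
rt60_by_band = {
    band: sabine_rt60(volume, surface, alpha)
    for band, alpha in absorption_by_band.items()
}

for band, rt60 in rt60_by_band.items():
    print(f"{band} Hz: RT60 = {rt60:.2f} s")
```

This shows why the quantities are natural candidates for joint estimation: under Sabine's approximation, knowing any three of volume, surface, absorption and reverberation time in a band determines the fourth.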
Taxonomy
Topics: Speech and Audio Processing · Hearing Loss and Rehabilitation · Blind Source Separation Techniques
