Speech dereverberation constrained on room impulse response characteristics
Louis Bahrman (S2A, IDS), Mathieu Fontaine (S2A, IDS), Jonathan Le Roux (MERL), Gaël Richard (S2A, IDS)

TL;DR
This paper introduces a physically interpretable deep learning approach for single-channel speech dereverberation that regularizes the model with a novel room impulse response coherence loss, ensuring the dereverberated speech aligns with room acoustics.
Contribution
It proposes a new physical coherence loss to regularize dereverberation models, making them more interpretable and physically consistent with room acoustics.
Findings
Enhanced physical coherence of dereverberated signals
Preservation of original dereverberation quality
Improved interpretability of deep learning models
Abstract
Single-channel speech dereverberation aims at extracting a dry speech signal from a recording affected by the acoustic reflections in a room. However, most current deep learning-based approaches for speech dereverberation are not interpretable for room acoustics, and can be considered as black-box systems in that regard. In this work, we address this problem by regularizing the training loss using a novel physical coherence loss which encourages the room impulse response (RIR) induced by the dereverberated output of the model to match the acoustic properties of the room in which the signal was recorded. Our investigation demonstrates the preservation of the original dereverberated signal alongside the provision of a more physically coherent RIR.
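The idea of an RIR "induced" by the dereverberated output can be illustrated with a small sketch. This summary does not give the paper's actual loss, so everything below is an assumption made for illustration: the induced RIR is obtained by regularized frequency-domain deconvolution of the reverberant signal by the dry estimate, and the "acoustic property" being compared is the Schroeder energy decay curve of a known reference RIR. The function names (`induced_rir`, `energy_decay_curve_db`, `coherence_penalty`) and the parameters `eps` and `floor_db` are hypothetical, and the code is plain NumPy, so it is not differentiable and could not be used as a training loss as written.

```python
import numpy as np


def induced_rir(reverberant, dry_estimate, rir_len, eps=1e-3):
    """Estimate the RIR that maps dry_estimate to reverberant.

    Assumption: regularized deconvolution in the frequency domain;
    eps avoids division by near-zero spectral bins.
    """
    n_fft = len(reverberant) + len(dry_estimate)
    Y = np.fft.rfft(reverberant, n_fft)
    S = np.fft.rfft(dry_estimate, n_fft)
    H = Y * np.conj(S) / (np.abs(S) ** 2 + eps)
    return np.fft.irfft(H, n_fft)[:rir_len]


def energy_decay_curve_db(rir):
    """Schroeder backward integration of the squared RIR, normalized, in dB."""
    edc = np.cumsum(rir[::-1] ** 2)[::-1]
    return 10.0 * np.log10(edc / edc[0] + 1e-12)


def coherence_penalty(reverberant, dry_estimate, reference_rir, floor_db=-30.0):
    """Mean squared difference (dB^2) between the decay curves of the
    induced RIR and a reference RIR, over the part of the reference
    decay that lies above floor_db (hypothetical choice)."""
    h_hat = induced_rir(reverberant, dry_estimate, len(reference_rir))
    ref = energy_decay_curve_db(reference_rir)
    est = energy_decay_curve_db(h_hat)
    mask = ref > floor_db
    return float(np.mean((est[mask] - ref[mask]) ** 2))


# Toy data: white-noise "speech" and a synthetic exponentially decaying RIR.
rng = np.random.default_rng(0)
dry = rng.standard_normal(4000)
rir = rng.standard_normal(800) * np.exp(-np.arange(800) / 100.0)
wet = np.convolve(dry, rir)

# A perfect dry estimate induces an RIR whose decay matches the room.
good = coherence_penalty(wet, dry, rir)
# Returning the reverberant input unchanged induces a near-impulse RIR.
bad = coherence_penalty(wet, wet, rir)
print(good, bad)
```

In this toy setting the penalty is small when the dry estimate is correct and large when the model output is still reverberant, which is the kind of behavior a coherence regularizer is meant to reward. In practice the reference room properties would have to come from the training data or from an estimator, and the penalty would be added to the usual dereverberation loss with a weighting factor.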
Taxonomy
Topics: Speech and Audio Processing · Advanced Adaptive Filtering Techniques
