Speech dereverberation constrained on room impulse response characteristics

Louis Bahrman (S2A, IDS); Mathieu Fontaine (S2A, IDS); Jonathan Le Roux (MERL); Gaël Richard (S2A, IDS)

arXiv:2407.08657·cs.SD·July 12, 2024

Open Access

TL;DR

This paper introduces a physically interpretable deep learning approach to single-channel speech dereverberation. Training is regularized with a novel room impulse response (RIR) coherence loss, which encourages the RIR induced by the dereverberated output to match the acoustic properties of the room in which the signal was recorded.

Contribution

The paper proposes a new physical coherence loss that regularizes dereverberation models, making their outputs more interpretable and more consistent with room acoustics.

Findings

01 Enhanced physical coherence of dereverberated signals

02 Preservation of original dereverberation quality

03 Improved interpretability of deep learning models

Abstract

Single-channel speech dereverberation aims at extracting a dry speech signal from a recording affected by the acoustic reflections in a room. However, most current deep learning-based approaches for speech dereverberation are not interpretable for room acoustics, and can be considered as black-box systems in that regard. In this work, we address this problem by regularizing the training loss using a novel physical coherence loss which encourages the room impulse response (RIR) induced by the dereverberated output of the model to match the acoustic properties of the room in which the signal was recorded. Our investigation demonstrates the preservation of the original dereverberated signal alongside the provision of a more physically coherent RIR.
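
The abstract does not say how the induced RIR is estimated or which acoustic properties are compared, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a regularized frequency-domain deconvolution to recover the RIR implied by a dereverberated estimate, and uses the Schroeder energy decay curve as a stand-in acoustic property. All function names, the `eps` regularizer, and the `weight` parameter are hypothetical.

```python
import numpy as np

def induced_rir(reverberant, dry_estimate, rir_len, eps=1e-3):
    """RIR implied by a dry estimate, via regularized deconvolution (assumed estimator)."""
    n = len(reverberant) + len(dry_estimate)  # zero-pad to avoid circular wrap-around
    Y = np.fft.rfft(reverberant, n)
    S = np.fft.rfft(dry_estimate, n)
    H = Y * np.conj(S) / (np.abs(S) ** 2 + eps)  # Wiener-like spectral division
    return np.fft.irfft(H, n)[:rir_len]

def energy_decay_curve_db(rir, floor=1e-12):
    """Schroeder backward integration of the squared RIR, normalized, in dB."""
    energy = np.cumsum(rir[::-1] ** 2)[::-1]
    normalized = energy / max(energy[0], floor)
    return 10.0 * np.log10(np.maximum(normalized, floor))

def coherence_penalty(reverberant, dry_estimate, reference_rir, eps=1e-3):
    """Mean absolute gap (dB) between induced and reference decay curves (assumed metric)."""
    estimated = induced_rir(reverberant, dry_estimate, len(reference_rir), eps)
    gap = energy_decay_curve_db(estimated) - energy_decay_curve_db(reference_rir)
    return float(np.mean(np.abs(gap)))

def total_loss(dry_estimate, dry_target, reverberant, reference_rir, weight=0.1):
    """Reconstruction error plus the weighted coherence penalty (hypothetical weighting)."""
    reconstruction = float(np.mean((dry_estimate - dry_target) ** 2))
    return reconstruction + weight * coherence_penalty(reverberant, dry_estimate, reference_rir)
```

In this sketch, a model output that still contains reverberation implies an RIR close to an impulse, whose decay curve falls far faster than that of the real room, so the penalty grows. In actual training the same computation would need to be written in a differentiable framework so that gradients reach the model.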

Taxonomy

Topics: Speech and Audio Processing · Advanced Adaptive Filtering Techniques