A Hybrid Model for Weakly-Supervised Speech Dereverberation

Louis Bahrman (S2A, IDS); Mathieu Fontaine (S2A, IDS); Gael Richard (S2A, IDS)

arXiv:2502.06839·eess.AS·February 12, 2025

TL;DR

This paper presents a hybrid training approach for speech dereverberation that uses minimal acoustic information and a reverberation matching loss, yielding more consistent performance across multiple objective metrics.

Contribution

It introduces a training strategy that uses limited acoustic information, such as the reverberation time (RT60), together with a reverberation matching loss, reducing reliance on paired dry/wet data and giving more consistent dereverberation results.

Findings

01

Outperforms state-of-the-art methods on multiple objective metrics.

02

Trains on reverberant (wet) speech plus limited acoustic information such as the reverberation time (RT60), avoiding the need for paired dry/wet data.

03

Achieves consistent dereverberation performance across various metrics.

Abstract

This paper introduces a new training strategy to improve speech dereverberation systems using minimal acoustic information and reverberant (wet) speech. Most existing algorithms rely on paired dry/wet data, which is difficult to obtain, or on target metrics that may not adequately capture reverberation characteristics and can lead to poor results on non-target metrics. Our approach uses limited acoustic information, like the reverberation time (RT60), to train a dereverberation system. The system's output is resynthesized using a generated room impulse response and compared with the original reverberant speech, providing a novel reverberation matching loss replacing the standard target metrics. During inference, only the trained dereverberation model is used. Experimental results demonstrate that our method achieves more consistent performance across various objective metrics used in…
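The resynthesis idea in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: the abstract does not say how the room impulse response is generated or in which domain the loss is computed, so this sketch assumes an exponentially decaying white-noise impulse response parameterized by RT60 and a time-domain mean squared error. The names `synth_rir` and `reverb_matching_loss` are hypothetical.

```python
import numpy as np

def synth_rir(rt60, sr=8000, seed=0):
    """Assumed RIR model: white noise with an exponential decay set by RT60."""
    rng = np.random.default_rng(seed)
    n = int(sr * rt60)                      # keep rt60 seconds of tail
    t = np.arange(n) / sr
    # A 60 dB energy decay means the amplitude falls by 10**-3 at t = rt60.
    envelope = 10.0 ** (-3.0 * t / rt60)
    rir = rng.standard_normal(n) * envelope
    return rir / np.linalg.norm(rir)        # unit-energy normalization

def reverb_matching_loss(dry_estimate, wet, rir):
    """Re-reverberate the dry estimate and compare it with the wet signal (MSE)."""
    resynth = np.convolve(dry_estimate, rir)[: len(wet)]
    return float(np.mean((resynth - wet) ** 2))

# Toy check with synthetic signals (no real speech involved).
rng = np.random.default_rng(1)
dry = rng.standard_normal(8000)             # stand-in for 1 s of dry speech
rir = synth_rir(rt60=0.5)
wet = np.convolve(dry, rir)[: len(dry)]     # simulated reverberant observation

loss_perfect = reverb_matching_loss(dry, wet, rir)   # ideal dereverberation
loss_identity = reverb_matching_loss(wet, wet, rir)  # model that does nothing
```

In this toy setup the loss is zero when the estimate equals the true dry signal and positive when the model simply returns its reverberant input, which is the behavior a matching loss needs in order to replace a paired dry/wet target during training.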
