The CHiME-7 UDASE task: Unsupervised domain adaptation for conversational speech enhancement

Simon Leglaive; Léonie Borne; Efthymios Tzinis; Mostafa Sadeghi; Matthieu Fraticelli; Scott Wisdom; Manuel Pariente; Daniel Pressnitzer; John R. Hershey

arXiv:2307.03533·cs.SD·January 9, 2024

Open Access

TL;DR

This paper presents the CHiME-7 UDASE task, focusing on unsupervised domain adaptation for conversational speech enhancement to improve performance on real-world noisy, reverberant multi-speaker recordings without clean references.

Contribution

The paper introduces a new unsupervised domain adaptation task for speech enhancement based on real-world conversational data, targeting the domain mismatch that affects supervised models.

Findings

01 Baseline system performance established

02 Challenges of real-world noisy speech addressed

03 Framework for future domain adaptation research provided

Abstract

Supervised speech enhancement models are trained using artificially generated mixtures of clean speech and noise signals, which may not match real-world recording conditions at test time. This mismatch can lead to poor performance if the test domain significantly differs from the synthetic training domain. This paper introduces the unsupervised domain adaptation for conversational speech enhancement (UDASE) task of the 7th CHiME challenge. This task aims to leverage real-world noisy speech recordings from the target domain for unsupervised domain adaptation of speech enhancement models. The target domain corresponds to the multi-speaker reverberant conversational speech recordings of the CHiME-5 dataset, for which the ground-truth clean speech reference is unavailable. Given a CHiME-5 recording, the task is to estimate the clean, potentially multi-speaker, reverberant speech, removing…

Taxonomy

Topics: Speech and Audio Processing · Speech Recognition and Synthesis