The fifth 'CHiME' Speech Separation and Recognition Challenge: Dataset, task and baselines
Jon Barker, Shinji Watanabe (CLSP), Emmanuel Vincent (MULTISPEECH), Jan Trmal (CLSP)

TL;DR
The 5th CHiME Challenge advances robust speech recognition by providing a new dataset and benchmarks for distant multi-microphone ASR in realistic home environments, promoting research in signal processing and machine learning.
Contribution
This paper introduces the 5th CHiME Challenge dataset, task, and baselines, focusing on distant multi-microphone conversational ASR in natural home settings, with detailed data collection and evaluation procedures.
Findings
New dataset with real home environment recordings
Baseline systems for array synchronization and speech enhancement
Performance benchmarks for robustness in distant-microphone ASR
Abstract
The CHiME challenge series aims to advance robust automatic speech recognition (ASR) technology by promoting research at the interface of speech and language processing, signal processing, and machine learning. This paper introduces the 5th CHiME Challenge, which considers the task of distant multi-microphone conversational ASR in real home environments. Speech material was elicited using a dinner party scenario with efforts taken to capture data that is representative of natural conversational speech and recorded by 6 Kinect microphone arrays and 4 binaural microphone pairs. The challenge features a single-array track and a multiple-array track and, for each track, distinct rankings will be produced for systems focusing on robustness with respect to distant-microphone capture vs. systems attempting to address all aspects of the task including conversational language modeling. We…
Taxonomy
Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Advanced Data Compression Techniques
