The CHiME-8 DASR Challenge for Generalizable and Array Agnostic Distant Automatic Speech Recognition and Diarization
Samuele Cornell, Taejin Park, Steve Huang, Christoph Boeddeker, Xuankai Chang, Matthew Maciejewski, Matthew Wiesner, Paola Garcia, and Shinji Watanabe

TL;DR
The CHiME-8 DASR challenge advances research in generalizable multi-channel distant speech recognition and diarization across diverse acoustic scenarios, incorporating new scenarios, LLM use, and practical tools to foster innovation.
Contribution
This paper introduces the CHiME-8 DASR challenge with new scenarios, LLM integration, a jury award, and baseline systems to promote robust and versatile speech recognition research.
Findings
Adding the NOTSOFAR-1 scenario increases task difficulty
Baseline systems struggle in complex scenarios
Toolkit and scoring tools lower entry barriers for participants
Abstract
This paper presents the CHiME-8 DASR challenge, which carries on from the previous edition, CHiME-7 DASR (C7DASR), and the past CHiME-6 challenge. It focuses on joint multi-channel distant speech recognition (DASR) and diarization with one or more, possibly heterogeneous, devices. The main goal is to spur research towards meeting transcription approaches that can generalize across an arbitrary number of speakers, diverse settings (formal vs. informal conversations), meeting duration, a wide variety of acoustic scenarios, and different recording configurations. Novelties with respect to C7DASR include: i) the addition of NOTSOFAR-1, an additional office/corporate meeting scenario, ii) a manually corrected Mixer 6 development set, iii) a new track in which we allow the use of large language models (LLMs), iv) a jury award mechanism to encourage participants to also explore more practical and…
Taxonomy
Methods: Dilated Convolution · Pointwise Convolution · Hierarchical Feature Fusion · Convolution · Efficient Spatial Pyramid · Parameterized ReLU · Kaiming Initialization · 1x1 Convolution · ESPNet
