The CHiME-8 DASR Challenge for Generalizable and Array Agnostic Distant Automatic Speech Recognition and Diarization

Samuele Cornell; Taejin Park; Steve Huang; Christoph Boeddeker; Xuankai Chang; Matthew Maciejewski; Matthew Wiesner; Paola Garcia; Shinji Watanabe

arXiv:2407.16447·eess.AS·July 24, 2024

TL;DR

The CHiME-8 DASR challenge advances research in generalizable multi-channel distant speech recognition and diarization across diverse acoustic scenarios, incorporating new scenarios, LLM use, and practical tools to foster innovation.

Contribution

This paper introduces the CHiME-8 DASR challenge with new scenarios, LLM integration, a jury award, and baseline systems to promote robust and versatile speech recognition research.

Findings

01 Adding the NOTSOFAR-1 scenario increases task difficulty

02 Baseline systems show performance challenges in complex scenarios

03 Toolkit and scoring tools lower entry barriers for participants

Abstract

This paper presents the CHiME-8 DASR challenge, which carries on from the previous edition, CHiME-7 DASR (C7DASR), and the earlier CHiME-6 challenge. It focuses on joint multi-channel distant speech recognition (DASR) and diarization with one or more, possibly heterogeneous, devices. The main goal is to spur research towards meeting transcription approaches that can generalize across an arbitrary number of speakers, diverse settings (formal vs. informal conversations), meeting durations, a wide variety of acoustic scenarios, and different recording configurations. Novelties with respect to C7DASR include: i) the addition of NOTSOFAR-1, an additional office/corporate meeting scenario; ii) a manually corrected Mixer 6 development set; iii) a new track in which we allow the use of large language models (LLMs); iv) a jury award mechanism to encourage participants to also explore more practical and…

Taxonomy

Methods: Dilated Convolution · Pointwise Convolution · Hierarchical Feature Fusion · Convolution · Efficient Spatial Pyramid · Parameterized ReLU · Kaiming Initialization · 1x1 Convolution · ESPNet