CHiME-6 Challenge: Tackling Multispeaker Speech Recognition for Unsegmented Recordings

Shinji Watanabe; Michael Mandel; Jon Barker; Emmanuel Vincent; Ashish Arora; Xuankai Chang; Sanjeev Khudanpur; Vimal Manohar; Daniel Povey; Desh Raj; David Snyder; Aswin Shanmugam Subramanian; Jan Trmal; Bar Ben Yair; Christoph Boeddeker; Zhaoheng Ni; Yusuke Fujita; Shota Horiguchi; Naoyuki Kanda; Takuya Yoshioka; Neville Ryant

arXiv:2004.09249·cs.SD·May 5, 2020·97 cites

Open Access

TL;DR

The paper introduces the CHiME-6 challenge focused on multispeaker speech recognition in natural home environments, emphasizing unsegmented recordings and providing open-source baselines for the community.

Contribution

It presents a new challenge for unsegmented multispeaker speech recognition with comprehensive baselines, advancing research in natural conversational speech processing.

Findings

01 First challenge to address unsegmented multispeaker recognition

02 Provides reproducible open-source baselines

03 Focuses on natural home environment recordings

Abstract

Following the success of the 1st, 2nd, 3rd, 4th and 5th CHiME challenges, we organize the 6th CHiME Speech Separation and Recognition Challenge (CHiME-6). The new challenge revisits the previous CHiME-5 challenge and further considers the problem of distant multi-microphone conversational speech diarization and recognition in everyday home environments. Speech material is the same as the previous CHiME-5 recordings except for accurate array synchronization. The material was elicited using a dinner party scenario with efforts taken to capture data that is representative of natural conversational speech. This paper provides a baseline description of the CHiME-6 challenge for both segmented multispeaker speech recognition (Track 1) and unsegmented multispeaker speech recognition (Track 2). Of note, Track 2 is the first challenge activity in the community to tackle an unsegmented…

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing