IEEE SLT 2021 Alpha-mini Speech Challenge: Open Datasets, Tracks, Rules and Baselines

Yihui Fu; Zhuoyuan Yao; Weipeng He; Jian Wu; Xiong Wang; Zhanheng Yang; Shimin Zhang; Lei Xie; Dongyan Huang; Hui Bu; Petr Motlicek; Jean-Marc Odobez

arXiv:2011.02198·cs.SD·November 17, 2020

TL;DR

The paper introduces the IEEE SLT 2021 Alpha-mini Speech Challenge, providing open datasets, rules, and baselines to advance research in keyword spotting and sound source localization on humanoid robots, emphasizing real-world data and evaluation.

Contribution

It offers a new challenge with open datasets, real robot recordings, and standardized evaluation methods for deep learning models in speech and sound localization tasks.

Findings

01 Large-scale open-source speech and noise datasets are provided.

02 Data recorded on the Alpha-mini robot includes real echo and noise conditions.

03 Baseline models are established for benchmarking performance.

Abstract

The IEEE Spoken Language Technology Workshop (SLT) 2021 Alpha-mini Speech Challenge (ASC) is intended to advance research on keyword spotting (KWS) and sound source localization (SSL) on humanoid robots. In recent years, many publications have reported significant improvements in deep-learning-based KWS and SSL on open-source datasets. When training deep learning models, it is necessary to expand data coverage to improve model robustness. Thus, simulating multi-channel noisy and reverberant data from single-channel speech, noise, echo and room impulse responses (RIRs) is widely adopted. However, this approach may produce a mismatch between simulated data and data recorded in real application scenarios, especially for echo data. In this challenge, we open source a sizable speech, keyword, echo and noise corpus for promoting data-driven methods, particularly deep-learning approaches on KWS and…
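The simulation approach mentioned in the abstract can be illustrated with a minimal sketch. It is not the challenge's baseline pipeline: the function name `simulate_multichannel`, the array shapes, and the synthetic random signals are all assumptions made for illustration, and echo from the robot's own playback is left out. The sketch convolves single-channel speech with one RIR per microphone and adds noise scaled to a target signal-to-noise ratio.

```python
import numpy as np


def simulate_multichannel(speech, rirs, noise, snr_db):
    """Hypothetical helper: build a noisy, reverberant multi-channel mixture.

    speech: (T,) single-channel dry speech
    rirs:   (C, L) one room impulse response per microphone channel
    noise:  (C, N) multi-channel noise with N >= T + L - 1
    snr_db: target ratio of reverberant speech power to noise power, in dB
    """
    # Full convolution of the dry speech with each channel's RIR.
    reverb = np.stack([np.convolve(speech, h) for h in rirs])
    n_samples = reverb.shape[1]
    if noise.shape[0] != reverb.shape[0] or noise.shape[1] < n_samples:
        raise ValueError("noise must have C channels and at least T + L - 1 samples")
    noise = noise[:, :n_samples]

    # Scale the noise so the overall speech-to-noise power ratio equals snr_db.
    speech_power = np.mean(reverb ** 2)
    noise_power = np.mean(noise ** 2)
    gain = np.sqrt(speech_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    scaled_noise = gain * noise
    return reverb + scaled_noise, reverb, scaled_noise


# Toy usage with synthetic signals (assumed stand-ins for real speech, RIRs and noise).
rng = np.random.default_rng(0)
speech = rng.standard_normal(1600)
rirs = rng.standard_normal((4, 256)) * np.exp(-np.arange(256) / 40.0)
noise = rng.standard_normal((4, 1600 + 256 - 1))
mixture, reverb, scaled_noise = simulate_multichannel(speech, rirs, noise, snr_db=10.0)
print(mixture.shape)  # (4, 1855)
```

Because the RIRs and noise here are synthetic, the output only demonstrates the mechanics. The abstract's point is that such simulated mixtures can differ from recordings made on the actual robot, which is why the challenge also releases real recorded data.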

Code & Models

Repositories

nwpuaslp/ASC_baseline