TL;DR
The paper introduces the IEEE SLT 2021 Alpha-mini Speech Challenge, providing open datasets, rules, and baselines to advance research in keyword spotting and sound source localization on humanoid robots, emphasizing real-world data and evaluation.
Contribution
It offers a new challenge with open datasets, real robot recordings, and standardized evaluation methods for deep learning models in speech and sound localization tasks.
Findings
Large-scale open-source speech and noise datasets are provided.
Data recorded on the Alpha-mini robot include real echo and noise conditions.
Baseline models are established for benchmarking performance.
Abstract
The IEEE Spoken Language Technology Workshop (SLT) 2021 Alpha-mini Speech Challenge (ASC) is intended to advance research on keyword spotting (KWS) and sound source localization (SSL) on humanoid robots. In recent years, many publications have reported significant improvements in deep-learning-based KWS and SSL on open-source datasets. Training deep learning models requires broad data coverage to make the models robust. Thus, simulating multi-channel noisy and reverberant data from single-channel speech, noise, echo and room impulse response (RIR) recordings is widely adopted. However, this approach can introduce a mismatch between simulated data and data recorded in real application scenarios, especially for echo data. In this challenge, we open source a sizable speech, keyword, echo and noise corpus for promoting data-driven methods, particularly deep-learning approaches on KWS and…
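The simulation approach mentioned in the abstract can be illustrated with a minimal sketch: single-channel speech is convolved with one RIR per microphone, and noise is added at a chosen signal-to-noise ratio. This is not the challenge's baseline or data pipeline. The function name `simulate_multichannel`, the four-channel layout, the 16 kHz rate, and the synthetic decaying-noise RIRs are all assumptions made for illustration, and echo from the robot's own loudspeaker is not modelled.

```python
import numpy as np


def simulate_multichannel(speech, rirs, noise, snr_db):
    """Hypothetical helper: build a noisy, reverberant multi-channel mixture.

    speech: (T,) single-channel clean speech
    rirs:   (C, L) one room impulse response per microphone channel
    noise:  (C, N) multi-channel noise with N >= T and nonzero power
    Returns an array of shape (C, T).
    """
    num_samples = speech.shape[0]
    # Convolve the dry speech with each channel's RIR, truncated to the input length.
    reverberant = np.stack(
        [np.convolve(speech, h)[:num_samples] for h in rirs]
    )
    noise = noise[:, :num_samples]
    speech_power = np.mean(reverberant ** 2)
    noise_power = np.mean(noise ** 2)
    # Scale the noise so that 10*log10(speech_power / scaled_noise_power) == snr_db.
    gain = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return reverberant + gain * noise


# Toy inputs (assumed values, not challenge data).
rng = np.random.default_rng(0)
num_channels, num_samples, rir_len = 4, 16000, 800
speech = rng.standard_normal(num_samples)
# Synthetic RIRs: exponentially decaying noise, a crude stand-in for a real room.
decay = np.exp(-np.arange(rir_len) / 200.0)
rirs = rng.standard_normal((num_channels, rir_len)) * decay
noise = rng.standard_normal((num_channels, num_samples))

mixture = simulate_multichannel(speech, rirs, noise, snr_db=10.0)
```

Because the RIRs and noise here are synthetic, data produced this way would show exactly the kind of mismatch with real recordings that the abstract describes, which is the motivation for releasing recordings made on the robot itself.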
