Challenges and Opportunities in Multi-device Speech Processing

Gregory Ciccarelli; Jarred Barber; Arun Nair; Israel Cohen; Tao Zhang

arXiv:2206.15432 · eess.AS · July 1, 2022 · 1 citation

Open Access

TL;DR

This paper reviews the current state, challenges, and future prospects of multi-device speech processing technologies, emphasizing the need for new datasets and solutions in smart home environments.

Contribution

It provides a comprehensive overview of technical challenges, existing solutions, and future directions in multi-device speech processing, highlighting gaps and opportunities for research.

Findings

01. Identifies key technical challenges in multi-device speech recognition

02. Highlights the need for specialized datasets for multi-device scenarios

03. Provides an outlook on future research directions in the field

Abstract

We review current solutions and technical challenges for automatic speech recognition, keyword spotting, device arbitration, speech enhancement, and source localization in multi-device home environments to provide context for the INTERSPEECH 2022 special session, "Challenges and opportunities for signal processing and machine learning for multiple smart devices." We also identify the datasets needed to support these research areas. Based on the review and our research experience in the multi-device domain, we conclude with an outlook on the future evolution

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing