ESPnet-SLU: Advancing Spoken Language Understanding through ESPnet
Siddhant Arora, Siddharth Dalmia, Pavel Denisov, Xuankai Chang, Yushi Ueda, Yifan Peng, Yuekai Zhang, Sujay Kumar, Karthik Ganesan, Brian Yan, Ngoc Thang Vu, Alan W Black, Shinji Watanabe

TL;DR
ESPnet-SLU is an open-source toolkit for rapid development and benchmarking of spoken language understanding systems. It integrates ASR and NLU models within a single framework, and its pretrained models match or outperform state-of-the-art results.
Contribution
The paper introduces ESPnet-SLU, a comprehensive open-source toolkit that streamlines SLU research by providing prebuilt models, benchmarks, and easy integration with existing speech processing tasks.
Findings
Pretrained models match or outperform state-of-the-art results.
Supports multiple SLU benchmarks with flexible model integration.
Enables reproducible and rapid SLU research.
Abstract
As Automatic Speech Recognition (ASR) systems improve, there is increasing interest in using ASR output for downstream Natural Language Processing (NLP) tasks. However, few open-source toolkits can generate reproducible results across different Spoken Language Understanding (SLU) benchmarks. Hence, there is a need for an open-source standard that gives researchers a faster start in SLU research. We present ESPnet-SLU, which is designed for quick development of spoken language understanding in a single framework. ESPnet-SLU is a project inside the end-to-end speech processing toolkit ESPnet, a widely used open-source standard for speech processing tasks such as ASR, Text-to-Speech (TTS), and Speech Translation (ST). We enhance the toolkit to provide implementations for various SLU benchmarks that enable researchers to seamlessly…
Taxonomy
Topics: Speech and dialogue systems · Natural Language Processing Techniques · Topic Modeling
