Recent Advances in End-to-End Spoken Language Understanding
Natalia Tomashenko, Antoine Caubriere, Yannick Esteve, Antoine Laurent, Emmanuel Morin

TL;DR
This paper investigates end-to-end spoken language understanding, in which a single neural model extracts semantic information directly from speech. It targets named entity recognition (NER) and slot filling, using speaker adaptation, a modified CTC training criterion, and sequential pretraining.
Contribution
It explores speaker adaptation, a modification of the CTC training criterion, and sequential pretraining as ways to improve end-to-end SLU performance on NER and slot filling.
Findings
Improved accuracy with speaker adaptation techniques
Enhanced model performance through modified CTC training
Effective use of sequential pretraining for SLU tasks
Abstract
This work investigates spoken language understanding (SLU) systems in the scenario where the semantic information is extracted directly from the speech signal by means of a single end-to-end neural network model. Two SLU tasks are considered: named entity recognition (NER) and semantic slot filling (SF). For these tasks, in order to improve the model performance, we explore various techniques including speaker adaptation, a modification of the connectionist temporal classification (CTC) training criterion, and sequential pretraining.
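For readers unfamiliar with CTC, a model trained with this criterion emits one symbol per audio frame, including a reserved blank, and the final sequence is obtained by merging consecutive repeats and then dropping blanks. The sketch below shows only this standard CTC collapse step, not the modified criterion explored in the paper. The blank symbol `_`, the function name `ctc_collapse`, and the angle-bracket entity markers are illustrative assumptions, not details taken from the source.

```python
from itertools import groupby

BLANK = "_"  # hypothetical blank symbol; real systems reserve a label index


def ctc_collapse(frames, blank=BLANK):
    """Standard CTC collapse: merge consecutive repeats, then drop blanks."""
    merged = [symbol for symbol, _ in groupby(frames)]
    return "".join(symbol for symbol in merged if symbol != blank)


# Hypothetical per-frame argmax output. "<" and ">" stand in for
# illustrative entity-boundary symbols emitted alongside characters.
frames = list("__<<pp_aa_rr_ii_ss>>__")
decoded = ctc_collapse(frames)
print(decoded)
```

Because the blank separates genuine repeats, a frame sequence such as `l_l` keeps both letters while `ll` collapses to one, which is why an end-to-end model can emit characters and semantic markers in a single output stream without a frame-level alignment.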
