End-to-end Spoken Language Understanding with Tree-constrained Pointer Generator
Guangzhi Sun, Chao Zhang, Philip C. Woodland

TL;DR
This paper applies a tree-constrained pointer generator (TCPGen) and a slot probability biasing (SPB) mechanism to end-to-end spoken language understanding, biasing recognition toward rare words and improving SLU-F1, including in a zero-shot setting with held-out slot types.
Contribution
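The paper studies a tree-constrained pointer generator (TCPGen) for contextual biasing in end-to-end SLU systems and proposes slot probability biasing (SPB), which derives a slot distribution from TCPGen to bias the model's output slot distribution, with the largest benefit on unseen entities.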
Findings
Consistent SLU-F1 improvements on the SLURP dataset with TCPGen and SPB, especially on unseen entities.
SLU-F1 above 50% in a zero-shot setting where 5 slot types are held out for testing.
Improved intent classification accuracy.
Abstract
End-to-end spoken language understanding (SLU) suffers from the long-tail word problem. This paper exploits contextual biasing, a technique to improve the speech recognition of rare words, in end-to-end SLU systems. Specifically, a tree-constrained pointer generator (TCPGen), a powerful and efficient biasing model component, is studied, which leverages a slot shortlist with corresponding entities to extract biasing lists. Meanwhile, to bias the SLU model output slot distribution, a slot probability biasing (SPB) mechanism is proposed to calculate a slot distribution from TCPGen. Experiments on the SLURP dataset showed consistent SLU-F1 improvements using TCPGen and SPB, especially on unseen entities. On a new split that holds out 5 slot types for testing, TCPGen with SPB achieved zero-shot learning with an SLU-F1 score over 50%, whereas the baselines cannot handle this setting. In…
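The abstract gives no implementation details, so the following is only a rough sketch of the general idea behind a tree-constrained biasing list. It assumes the biasing entities are stored as a prefix tree over tokens, that only tokens continuing the current prefix are candidates at each decoding step, and that a pointer distribution over those candidates is mixed with the model's own output distribution using a generation probability. The names `build_prefix_tree`, `valid_next_tokens`, `interpolate`, and the `END` marker are hypothetical and are not taken from the paper or its code.

```python
from typing import Dict, List, Sequence

END = "<end>"  # hypothetical marker meaning "a complete entity ends here"


def build_prefix_tree(entities: Sequence[Sequence[str]]) -> dict:
    """Store each tokenised entity as a path in a nested-dict prefix tree."""
    root: dict = {}
    for tokens in entities:
        node = root
        for tok in tokens:
            node = node.setdefault(tok, {})
        node[END] = {}
    return root


def valid_next_tokens(tree: dict, prefix: Sequence[str]) -> List[str]:
    """Return the tokens that can extend `prefix` inside the tree.

    An empty list means the prefix is not on any path in the biasing list
    (or the path has no further tokens).
    """
    node = tree
    for tok in prefix:
        if tok not in node:
            return []
        node = node[tok]
    return sorted(t for t in node if t != END)


def interpolate(
    p_model: Dict[str, float],
    p_pointer: Dict[str, float],
    p_gen: float,
) -> Dict[str, float]:
    """Mix model and pointer distributions (assumed pointer-generator form).

    `p_gen` is the assumed weight given to the pointer distribution.
    """
    vocab = set(p_model) | set(p_pointer)
    return {
        t: (1.0 - p_gen) * p_model.get(t, 0.0) + p_gen * p_pointer.get(t, 0.0)
        for t in vocab
    }


# Toy biasing list; the entities are illustrative only.
tree = build_prefix_tree([["new", "york"], ["new", "castle"], ["york"]])
candidates = valid_next_tokens(tree, ["new"])
mixed = interpolate({"york": 0.1, "the": 0.9}, {"york": 1.0}, p_gen=0.5)
```

In this toy setup, the tree restricts the pointer to "castle" and "york" after the prefix "new", and the mixing step shifts probability mass toward a biasing-list token that the base model rates as unlikely. In a real system the tokens would typically be subword units and both the pointer distribution and `p_gen` would be predicted by the network; here they are fixed by hand for illustration.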
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems
