A Robust Semantic Frame Parsing Pipeline on a New Complex Twitter Dataset
Yu Wang, Hongxia Jin

TL;DR
This paper presents a robust semantic frame parsing pipeline capable of handling out-of-distribution patterns and out-of-vocabulary tokens, tested on a new complex Twitter dataset with longer, more diverse utterances.
Contribution
The authors introduce a novel semantic frame parsing pipeline that improves robustness to OOD patterns and OOV tokens, validated on a new complex Twitter dataset and on existing benchmarks such as SNIPS.
Findings
Outperforms state-of-the-art baseline SLU models on both the SNIPS dataset and the new Twitter dataset
Demonstrates robustness to OOD patterns and OOV tokens
Validates effectiveness through an end-to-end application demo
Abstract
Most recent semantic frame parsing systems for spoken language understanding (SLU) are designed based on recurrent neural networks. These systems display decent performance on benchmark SLU datasets such as ATIS or SNIPS, which contain short utterances with relatively simple patterns. However, the current semantic frame parsing models lack a mechanism to handle out-of-distribution (OOD) patterns and out-of-vocabulary (OOV) tokens. In this paper, we introduce a robust semantic frame parsing pipeline that can handle both OOD patterns and OOV tokens in conjunction with a new complex Twitter dataset that contains long tweets with more OOD patterns and OOV tokens. The new pipeline demonstrates much better results in comparison to state-of-the-art baseline SLU models on both the SNIPS dataset and the new Twitter dataset (Our new Twitter dataset can be…
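To make the notion of OOV tokens in the abstract concrete, the following is a minimal sketch of how one might flag tokens in an utterance that never appeared in the training data. It is not the paper's pipeline: the whitespace tokenization, the function names (build_vocab, oov_tokens, oov_rate), and the toy sentences are all assumptions chosen for illustration.

```python
from typing import Iterable, List, Set


def build_vocab(utterances: Iterable[str]) -> Set[str]:
    """Collect the lowercased whitespace tokens seen in training utterances."""
    return {tok for utt in utterances for tok in utt.lower().split()}


def oov_tokens(utterance: str, vocab: Set[str]) -> List[str]:
    """Return the tokens of an utterance that are absent from the vocabulary."""
    return [tok for tok in utterance.lower().split() if tok not in vocab]


def oov_rate(utterance: str, vocab: Set[str]) -> float:
    """Fraction of tokens that are OOV; 0.0 for an empty utterance."""
    tokens = utterance.lower().split()
    if not tokens:
        return 0.0
    return len(oov_tokens(utterance, vocab)) / len(tokens)


# Hypothetical toy data: short, clean training commands versus a noisy tweet.
train = ["play some jazz music", "book a table for two"]
vocab = build_vocab(train)
tweet = "omg play sum jazz rn"

print(oov_tokens(tweet, vocab))
print(oov_rate(tweet, vocab))
```

In this toy case three of the five tweet tokens are unseen, which hints at why long, informal tweets can stress a model trained on short, regular utterances.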
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Speech and dialogue systems
