Dependency Parsing for Spoken Dialog Systems
Sam Davidson, Dian Yu, Zhou Yu

TL;DR
This paper introduces SCUD, an annotation scheme for dependency parsing in spoken dialog systems, and presents ConvBank, a new dataset, demonstrating improved parsing accuracy through pre-training and fine-tuning.
Contribution
The paper proposes SCUD, a new annotation scheme for spoken dialog dependency parsing, and provides ConvBank, a dataset for training and evaluating dialog parsers.
Findings
Pre-training the parser on larger datasets before fine-tuning on ConvBank improves parsing accuracy.
Fine-tuning on ConvBank yields 85.05% unlabeled attachment accuracy.
The approach outperforms models trained on non-dialog datasets.
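
For readers unfamiliar with the metric behind the 85.05% figure, the following is a minimal sketch of how unlabeled and labeled attachment accuracy are commonly computed. It is not the authors' evaluation code: the function name `attachment_scores`, the `(head, label)` token representation, and the toy utterance are all assumptions made for illustration, and the sketch ignores details such as punctuation handling that a real evaluation script may apply.

```python
from typing import List, Tuple

# Hypothetical representation: one (head index, dependency label) pair per token.
# Head index 0 denotes the artificial root; tokens are numbered from 1.
Arc = Tuple[int, str]


def attachment_scores(gold: List[Arc], pred: List[Arc]) -> Tuple[float, float]:
    """Return (unlabeled, labeled) attachment accuracy over aligned tokens."""
    if len(gold) != len(pred):
        raise ValueError("gold and predicted parses must cover the same tokens")
    if not gold:
        raise ValueError("cannot score an empty parse")
    # Unlabeled: the predicted head matches the gold head.
    head_hits = sum(g[0] == p[0] for g, p in zip(gold, pred))
    # Labeled: both the head and the dependency label match.
    full_hits = sum(g == p for g, p in zip(gold, pred))
    n = len(gold)
    return head_hits / n, full_hits / n


# Toy utterance (illustrative only): "play some jazz please"
gold = [(0, "root"), (3, "det"), (1, "obj"), (1, "discourse")]
pred = [(0, "root"), (3, "det"), (1, "nmod"), (3, "discourse")]

uas, las = attachment_scores(gold, pred)
print(f"unlabeled: {uas:.2f}, labeled: {las:.2f}")
```

In this toy case the third token has the right head but the wrong label, and the fourth has the wrong head, so the unlabeled score is higher than the labeled one. The labeled score can never exceed the unlabeled score, since a labeled match also requires the correct head.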
Abstract
Dependency parsing of conversational input can play an important role in language understanding for dialog systems by identifying the relationships between entities extracted from user utterances. Additionally, effective dependency parsing can elucidate differences in language structure and usage for discourse analysis of human-human versus human-machine dialogs. However, models trained on datasets based on news articles and web data do not perform well on spoken human-machine dialog, and currently available annotation schemes do not adapt well to dialog data. Therefore, we propose the Spoken Conversation Universal Dependencies (SCUD) annotation scheme that extends the Universal Dependencies (UD) (Nivre et al., 2016) guidelines to spoken human-machine dialogs. We also provide ConvBank, a conversation dataset between humans and an open-domain conversational dialog system with SCUD…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems
