TL;DR
This paper introduces a hybrid model that combines GPT-2 with Graph Attention Networks to capture inter-slot relationships across multiple domains, improving dialogue state tracking on MultiWOZ 2.0 over a strong GPT-2 baseline.
Contribution
The paper proposes a novel architecture that integrates graph-based representations with GPT-2 for enhanced dialogue state tracking, especially across multiple domains.
Findings
Improved state tracking performance on MultiWOZ 2.0
Graph modules effectively capture inter-slot dependencies
Enhanced prediction accuracy for cross-domain slot values
Abstract
Dialogue State Tracking (DST) is central to multi-domain task-oriented dialogue systems, where it is responsible for extracting information from user utterances. We present a novel hybrid architecture that augments GPT-2 with representations derived from Graph Attention Networks in a way that allows causal, sequential prediction of slot values. The architecture captures inter-slot relationships and cross-domain dependencies that can otherwise be lost in sequential prediction. We report improvements in state tracking performance on MultiWOZ 2.0 against a strong GPT-2 baseline and investigate a simplified sparse training scenario in which DST models are trained only on session-level annotations but evaluated at the turn level. We further report detailed analyses to demonstrate the effectiveness of graph models in DST by showing that the proposed graph modules capture inter-slot dependencies and…
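The abstract does not say how the graph representations are wired into GPT-2, so the following is only a minimal sketch of the general idea of graph attention over slot nodes, not the paper's implementation. It shows a generic single-head graph attention layer in NumPy. The slot names, the adjacency matrix, the feature sizes, and the function name `gat_layer` are all hypothetical choices made for illustration.

```python
import numpy as np


def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)


def gat_layer(h, adj, W, a, slope=0.2):
    """Generic single-head graph attention layer (illustrative only).

    h:   (N, F) node features, one row per slot
    adj: (N, N) 0/1 adjacency, assumed to include self-loops
    W:   (F, F_out) shared linear projection
    a:   (2 * F_out,) attention vector
    """
    z = h @ W                                   # project every node
    f_out = z.shape[1]
    # score(i, j) = LeakyReLU(a^T [z_i || z_j]) splits into two dot products
    src = z @ a[:f_out]
    dst = z @ a[f_out:]
    e = leaky_relu(src[:, None] + dst[None, :], slope)
    e = np.where(adj > 0, e, -np.inf)           # attend to neighbours only
    e = e - e.max(axis=1, keepdims=True)        # numerically stable softmax
    alpha = np.exp(e)
    alpha = alpha / alpha.sum(axis=1, keepdims=True)
    return alpha @ z, alpha                     # aggregated features, weights


# Hypothetical slot graph: the two "area" slots are linked to each other and
# the restaurant area is linked to the taxi destination.
slots = ["hotel-area", "restaurant-area", "taxi-destination"]
adj = np.array([[1, 1, 0],
                [1, 1, 1],
                [0, 1, 1]])

rng = np.random.default_rng(0)
h = rng.normal(size=(3, 8))    # assumed 8-dim slot embeddings
W = rng.normal(size=(8, 4))    # assumed 4-dim output
a = rng.normal(size=(8,))      # 2 * F_out

out, alpha = gat_layer(h, adj, W, a)
```

Each row of `alpha` is a distribution over that slot's neighbours, so inspecting it gives a rough picture of which slots inform which. In a hybrid model of the kind described, vectors like the rows of `out` would be made available to the language model when it predicts slot values, but that coupling is an assumption here.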
Taxonomy
Methods: Dynamic Sparse Training · Linear Layer · Dropout · Multi-Head Attention · Attention Dropout · Dense Connections · Attention Is All You Need · Discriminative Fine-Tuning · Cosine Annealing
