Encoding Word Confusion Networks with Recurrent Neural Networks for Dialog State Tracking
Glorianna Jagfeld, Ngoc Thang Vu

TL;DR
This paper introduces a recurrent neural network-based method to encode word confusion networks, improving dialog state tracking performance by leveraging richer ASR hypothesis information over traditional best-hypothesis encoding.
Contribution
The paper proposes a novel RNN-based encoding of word confusion networks for dialog state tracking, which outperforms encoding only the best ASR hypothesis.
Findings
Encoding word confusion networks improves dialog state tracking over encoding only the best ASR hypothesis.
The improvement is shown with a neural dialog state tracker on the second Dialog State Tracking Challenge (DSTC2) dataset.
Recurrent neural networks can encode the richer ASR hypothesis space that confusion networks represent.
Abstract
This paper presents our novel method to encode word confusion networks, which can represent a rich hypothesis space of automatic speech recognition systems, via recurrent neural networks. We demonstrate the utility of our approach for the task of dialog state tracking in spoken dialog systems that relies on automatic speech recognition output. Encoding confusion networks outperforms encoding the best hypothesis of the automatic speech recognition in a neural system for dialog state tracking on the well-known second Dialog State Tracking Challenge dataset.
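To make the contrast in the abstract concrete, the following is a minimal sketch of a word confusion network and of two ways to turn it into input for a recurrent encoder. It is an illustration under stated assumptions, not the authors' implementation: the data layout, the function names `best_hypothesis` and `pool_slots`, the toy two-dimensional embeddings, and the choice of posterior-weighted sum pooling are all hypothetical and are not confirmed by this summary.

```python
from typing import Dict, List, Tuple

# Assumed layout: one list of (word, posterior) alternatives per time slot.
ConfusionNetwork = List[List[Tuple[str, float]]]


def best_hypothesis(cnet: ConfusionNetwork) -> List[str]:
    """Keep only the highest-scoring word in each slot (the 1-best path)."""
    return [max(slot, key=lambda arc: arc[1])[0] for slot in cnet]


def pool_slots(
    cnet: ConfusionNetwork, embeddings: Dict[str, List[float]]
) -> List[List[float]]:
    """Collapse each slot into one vector: the posterior-weighted sum of
    the embeddings of all alternatives in that slot (an assumed pooling)."""
    dim = len(next(iter(embeddings.values())))
    pooled = []
    for slot in cnet:
        vec = [0.0] * dim
        for word, score in slot:
            for i, x in enumerate(embeddings[word]):
                vec[i] += score * x
        pooled.append(vec)
    return pooled


# Toy utterance in which the recognizer is unsure about both words.
cnet: ConfusionNetwork = [
    [("cheap", 0.6), ("chip", 0.4)],
    [("food", 0.9), ("foot", 0.1)],
]

# Hypothetical embeddings, chosen only to keep the arithmetic readable.
embeddings = {
    "cheap": [1.0, 0.0],
    "chip": [0.0, 1.0],
    "food": [1.0, 1.0],
    "foot": [0.0, 0.0],
}

one_best = best_hypothesis(cnet)        # discards the alternatives
pooled = pool_slots(cnet, embeddings)   # keeps them, weighted by score
```

The 1-best path keeps one word per slot and drops the recognizer's uncertainty, while the pooled sequence keeps one vector per slot that still reflects every alternative and its score. Either sequence could then be passed step by step to a recurrent network that feeds a dialog state tracker.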
