Deep Reinforcement Learning for On-line Dialogue State Tracking

Zhi Chen; Lu Chen; Xiang Zhou; Kai Yu

arXiv:2009.10321 · cs.CL · September 23, 2020

TL;DR

This paper introduces a deep reinforcement learning (DRL) framework that optimizes dialogue state tracking (DST) on-line. The framework improves dialogue manager performance and allows the dialogue policy to be optimized jointly with the tracker.

Contribution

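To the authors' knowledge, this is the first work to optimize the DST module within a DRL framework for on-line task-oriented spoken dialogue systems. The framework also supports joint training of DST and the dialogue policy.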

Findings

01. On-line DST optimization improves dialogue manager performance.

02. Joint training of DST and the dialogue policy yields further improvements.

03. The framework keeps the flexibility of using a predefined policy.

Abstract

Dialogue state tracking (DST) is a crucial module in dialogue management. It is usually cast as a supervised training problem, which is not convenient for on-line optimization. In this paper, a novel companion-teaching-based deep reinforcement learning (DRL) framework for on-line DST optimization is proposed. To the best of our knowledge, this is the first effort to optimize the DST module within a DRL framework for on-line task-oriented spoken dialogue systems. In addition, the dialogue policy can be jointly updated. Experiments show that on-line DST optimization can effectively improve dialogue manager performance while keeping the flexibility of using a predefined policy. Joint training of both DST and policy can further improve the performance.
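The idea of optimizing a tracker from dialogue-level reward while the policy stays fixed can be illustrated with a toy sketch. This is not the paper's algorithm: it swaps in plain REINFORCE on a single-slot, single-turn problem and omits companion teaching entirely. The slot values, the `predefined_policy` rule, the `simulated_user_reward` function, and all hyperparameters are hypothetical choices made for illustration.

```python
import math
import random

# Hypothetical slot values for a single "price range" slot.
VALUES = ["cheap", "moderate", "expensive"]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predefined_policy(tracked_value):
    """Fixed handcrafted policy (assumption): offer what the tracker reports."""
    return ("offer", tracked_value)


def simulated_user_reward(action, goal):
    """Toy reward (assumption): +1 if the offer matches the user goal, else -1."""
    return 1.0 if action == ("offer", goal) else -1.0


def train_tracker(episodes=3000, lr=0.1, noise=0.2, seed=0):
    """Update only the tracker with REINFORCE; the policy is never changed."""
    rng = random.Random(seed)
    n = len(VALUES)
    # weights[o][v]: logit for tracking value v after observing value o.
    weights = [[0.0] * n for _ in range(n)]
    for _ in range(episodes):
        goal = rng.randrange(n)
        # Noisy observation, standing in for speech understanding errors.
        obs = goal if rng.random() > noise else rng.randrange(n)
        probs = softmax(weights[obs])
        choice = rng.choices(range(n), weights=probs)[0]
        action = predefined_policy(VALUES[choice])
        reward = simulated_user_reward(action, VALUES[goal])
        # Gradient of log softmax at the sampled choice: onehot(choice) - probs.
        for v in range(n):
            grad = (1.0 if v == choice else 0.0) - probs[v]
            weights[obs][v] += lr * reward * grad
    return weights


if __name__ == "__main__":
    trained = train_tracker()
    for o, row in enumerate(trained):
        print(VALUES[o], [round(p, 3) for p in softmax(row)])
```

In this sketch the tracker receives no state labels: the only learning signal is the reward produced after the fixed policy acts on the tracker's output, which is what makes the setup usable on-line. Joint training would additionally give the policy its own learnable parameters updated from the same reward.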
