Budgeted Policy Learning for Task-Oriented Dialogue Systems

Zhirui Zhang; Xiujun Li; Jianfeng Gao; Enhong Chen

arXiv:1906.00499 · cs.CL · June 4, 2019 · 19 cites

Open Access

TL;DR

This paper introduces a budget-aware learning method for task-oriented dialogue agents that allocates a fixed, small number of user interactions across training to improve task success rate.

Contribution

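It extends Deep Dyna-Q (DDQ) with Budget-Conscious Scheduling (BCS), which comprises a Poisson-based global scheduler, a controller that chooses between real and simulated experiences at each training step, and a user goal sampling module, so that the agent learns efficiently under a fixed interaction budget.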

Findings

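01  Significant success-rate improvements over state-of-the-art baselines under the same fixed budget.

02  More effective use of a fixed, small number of user interactions (the budget).

03  Improvements hold on a movie-ticket booking task with both simulated and real users.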

Abstract

This paper presents a new approach that extends Deep Dyna-Q (DDQ) by incorporating a Budget-Conscious Scheduling (BCS) to best utilize a fixed, small amount of user interactions (budget) for learning task-oriented dialogue agents. BCS consists of (1) a Poisson-based global scheduler to allocate budget over different stages of training; (2) a controller to decide at each training step whether the agent is trained using real or simulated experiences; (3) a user goal sampling module to generate the experiences that are most effective for policy learning. Experiments on a movie-ticket booking task with simulated and real users show that our approach leads to significant improvements in success rate over the state-of-the-art baselines given the fixed budget.
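The abstract names a Poisson-based global scheduler but does not give its formula. As a rough illustration only, the sketch below assumes the budget is split across training stages in proportion to a Poisson probability mass function, truncated to the number of stages and renormalized. The function names, the `lam` parameter, and the largest-remainder rounding are hypothetical choices for this example, not the authors' method.

```python
import math
from typing import List


def poisson_pmf(k: int, lam: float) -> float:
    """Poisson probability mass P(K = k) for rate lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def allocate_budget(total_budget: int, num_stages: int, lam: float) -> List[int]:
    """Split total_budget real-user dialogues across num_stages training stages.

    Assumption (not from the paper): stage k receives a share proportional to
    the Poisson pmf at k, truncated to num_stages and renormalized.
    """
    weights = [poisson_pmf(k, lam) for k in range(num_stages)]
    total_weight = sum(weights)
    raw = [total_budget * w / total_weight for w in weights]

    # Round down first, then hand the leftover dialogues to the stages with
    # the largest fractional parts so the shares sum to total_budget.
    alloc = [math.floor(r) for r in raw]
    leftover = total_budget - sum(alloc)
    by_remainder = sorted(
        range(num_stages), key=lambda i: raw[i] - alloc[i], reverse=True
    )
    for i in by_remainder[:leftover]:
        alloc[i] += 1
    return alloc


if __name__ == "__main__":
    # Hypothetical setting: 100 real-user dialogues over 10 stages, lam = 3.
    print(allocate_budget(100, 10, 3.0))
```

With these assumed settings the allocation concentrates real interactions around stages 2 and 3 and tapers toward the end of training; changing `lam` moves that peak earlier or later. The controller and the user goal sampling module are not sketched here, because the abstract does not describe how they make their decisions.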

Taxonomy

Topics: Speech and dialogue systems · Topic Modeling · Intelligent Tutoring Systems and Adaptive Learning