Composite Task-Completion Dialogue Policy Learning via Hierarchical Deep Reinforcement Learning

Baolin Peng; Xiujun Li; Lihong Li; Jianfeng Gao; Asli Celikyilmaz; Sungjin Lee; Kam-Fai Wong

arXiv:1704.03084·cs.CL·July 25, 2017·36 cites

Open Access

TL;DR

This paper introduces a hierarchical deep reinforcement learning framework for training dialogue agents that handle composite, multi-subtask tasks such as travel planning. The agent makes decisions at two temporal scales: choosing which subtask to pursue, and choosing primitive actions within that subtask.

Contribution

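It formulates composite task completion as options over MDPs and develops a hierarchical dialogue policy: a top-level policy that selects subtasks (options) and a low-level policy that selects primitive actions to complete the chosen subtask. The approach improves over rule-based and flat deep RL baselines.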
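The two-level control flow described above can be illustrated with a toy loop. This is only a sketch under stated assumptions, not the authors' implementation: both policies here are hand-written stand-ins rather than learned deep RL policies, the simulated user always answers the requested slot, and every subtask name, slot name, and function name is hypothetical.

```python
import random
from typing import Dict, List, Set, Tuple

# Hypothetical subtasks and slots, chosen only for illustration.
SUBTASK_SLOTS: Dict[str, List[str]] = {
    "book_flight": ["depart_city", "arrive_city", "depart_date"],
    "reserve_hotel": ["hotel_city", "checkin_date"],
}


def top_level_policy(done: Set[str], rng: random.Random) -> str:
    """Pick an unfinished subtask (stand-in for a learned top-level policy)."""
    remaining = sorted(s for s in SUBTASK_SLOTS if s not in done)
    return rng.choice(remaining)


def low_level_policy(subtask: str, filled: Set[str]) -> str:
    """Pick a primitive action: request the next unfilled slot of the subtask."""
    for slot in SUBTASK_SLOTS[subtask]:
        if slot not in filled:
            return f"request({slot})"
    return "inform(result)"  # all slots known, so the subtask can terminate


def run_dialogue(seed: int = 0) -> List[Tuple[str, str]]:
    """Run one toy dialogue and return the (subtask, action) trace."""
    rng = random.Random(seed)
    done: Set[str] = set()
    filled: Set[str] = set()
    trace: List[Tuple[str, str]] = []
    while len(done) < len(SUBTASK_SLOTS):
        subtask = top_level_policy(done, rng)
        # The low-level policy keeps control until the subtask terminates.
        while True:
            action = low_level_policy(subtask, filled)
            trace.append((subtask, action))
            if action == "inform(result)":
                done.add(subtask)
                break
            # Toy user simulator: always supplies the requested slot.
            filled.add(action[len("request("):-1])
    return trace


if __name__ == "__main__":
    for subtask, action in run_dialogue():
        print(f"{subtask:14s} {action}")
```

The point of the sketch is the temporal abstraction: the top-level choice is made once per subtask, while the low-level choice is made every turn until the subtask ends.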

Findings

01 Significant performance improvements over rule-based and flat deep RL baselines.

02 Effective handling of complex multi-subtask dialogues in travel planning.

03 Demonstrated success with both simulated and real user interactions.

Abstract

Building a dialogue agent to fulfill complex tasks, such as travel planning, is challenging because the agent has to learn to collectively complete multiple subtasks. For example, the agent needs to reserve a hotel and book a flight so that there is enough time to commute between arrival and hotel check-in. This paper addresses this challenge by formulating the task in the mathematical framework of options over Markov Decision Processes (MDPs), and proposing a hierarchical deep reinforcement learning approach to learning a dialogue manager that operates at different temporal scales. The dialogue manager consists of: (1) a top-level dialogue policy that selects among subtasks or options, (2) a low-level dialogue policy that selects primitive actions to complete the subtask given by the top-level policy, and (3) a global state tracker that helps ensure all cross-subtask constraints…

Taxonomy

Topics: Speech and dialogue systems · Topic Modeling · Multi-Agent Systems and Negotiation