Composite Task-Completion Dialogue Policy Learning via Hierarchical Deep Reinforcement Learning
Baolin Peng, Xiujun Li, Lihong Li, Jianfeng Gao, Asli Celikyilmaz, Sungjin Lee, Kam-Fai Wong

TL;DR
This paper introduces a hierarchical deep reinforcement learning framework for training dialogue agents that handle composite, multi-subtask tasks such as travel planning by making decisions at two temporal scales: choosing a subtask, then choosing primitive actions within it.
Contribution
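It formulates composite task completion in the framework of options over MDPs and develops a hierarchical dialogue policy: a top-level policy selects among subtasks, and a low-level policy selects primitive actions to complete the chosen subtask. The resulting agent outperforms rule-based and flat deep RL baselines.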
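The two-level control flow described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: tabular Q-values stand in for the deep networks, and every name here (`HierarchicalPolicy`, `run_episode`, the toy slot-counting environment, the subtask and action labels) is hypothetical and chosen only for the example.

```python
import random
from collections import defaultdict


class HierarchicalPolicy:
    """Two-level epsilon-greedy policy over tabular Q-values.

    Assumption: the tables are a stand-in for learned deep Q-networks.
    """

    def __init__(self, subtasks, primitive_actions, epsilon=0.1, seed=0):
        self.subtasks = list(subtasks)
        self.primitive_actions = list(primitive_actions)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.q_top = defaultdict(float)  # (state, subtask) -> value
        self.q_low = defaultdict(float)  # (state, subtask, action) -> value

    def select_subtask(self, state, pending):
        """Top level: pick one of the unfinished subtasks (an option)."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(pending)
        return max(pending, key=lambda g: self.q_top[(state, g)])

    def select_action(self, state, subtask):
        """Low level: pick a primitive action for the active subtask."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.primitive_actions)
        return max(self.primitive_actions,
                   key=lambda a: self.q_low[(state, subtask, a)])


def run_episode(policy, state, env_step, subtask_done, max_turns=20):
    """Run one dialogue: the low level acts until its subtask terminates,
    then control returns to the top level."""
    trace = []
    while len(trace) < max_turns:
        pending = [g for g in policy.subtasks if not subtask_done(state, g)]
        if not pending:
            break  # every subtask is complete
        subtask = policy.select_subtask(state, pending)
        while not subtask_done(state, subtask) and len(trace) < max_turns:
            action = policy.select_action(state, subtask)
            state = env_step(state, subtask, action)
            trace.append((subtask, action))
    return state, trace


# Hypothetical toy environment: each subtask is done after two turns.
SUBTASKS = ["book_flight", "reserve_hotel"]
ACTIONS = ["request", "inform"]
SLOTS_NEEDED = 2


def toy_step(state, subtask, action):
    filled = list(state)
    filled[SUBTASKS.index(subtask)] += 1
    return tuple(filled)


def toy_subtask_done(state, subtask):
    return state[SUBTASKS.index(subtask)] >= SLOTS_NEEDED


policy = HierarchicalPolicy(SUBTASKS, ACTIONS, epsilon=0.0)
final_state, trace = run_episode(policy, (0, 0), toy_step, toy_subtask_done)
```

With `epsilon=0.0` and untrained (all-zero) values, the sketch simply works through the subtasks in order. In a learned agent, the two value tables would be updated from rewards so that the top level learns a useful subtask ordering.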
Findings
Significant performance improvements over rule-based and flat deep RL baselines.
Effective handling of complex multi-subtask dialogues in travel planning.
Demonstrated success with both simulated and real user interactions.
Abstract
Building a dialogue agent to fulfill complex tasks, such as travel planning, is challenging because the agent has to learn to collectively complete multiple subtasks. For example, the agent needs to reserve a hotel and book a flight so that there is enough time to commute between arrival and hotel check-in. This paper addresses this challenge by formulating the task in the mathematical framework of options over Markov Decision Processes (MDPs), and proposing a hierarchical deep reinforcement learning approach to learning a dialogue manager that operates at different temporal scales. The dialogue manager consists of: (1) a top-level dialogue policy that selects among subtasks or options, (2) a low-level dialogue policy that selects primitive actions to complete the subtask given by the top-level policy, and (3) a global state tracker that helps ensure all cross-subtask constraints…
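To make the cross-subtask constraint in the abstract's travel example concrete, here is a small sketch of a tracker that checks whether hotel check-in leaves enough time after flight arrival. It is an assumption-laden illustration only: the class name `GlobalStateTracker`, the slot names, and the one-hour commute threshold are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class GlobalStateTracker:
    """Holds slot values from all subtasks and checks one shared constraint."""

    slots: dict = field(default_factory=dict)
    min_commute: timedelta = timedelta(hours=1)  # hypothetical threshold

    def update(self, subtask, slot, value):
        """Record a slot value reported while working on a subtask."""
        self.slots[(subtask, slot)] = value

    def constraints_satisfied(self):
        """True unless check-in falls too soon after flight arrival."""
        arrival = self.slots.get(("book_flight", "arrival_time"))
        check_in = self.slots.get(("reserve_hotel", "check_in_time"))
        if arrival is None or check_in is None:
            return True  # nothing to violate until both slots are filled
        return check_in - arrival >= self.min_commute


tracker = GlobalStateTracker()
tracker.update("book_flight", "arrival_time", datetime(2024, 5, 1, 14, 0))
tracker.update("reserve_hotel", "check_in_time", datetime(2024, 5, 1, 15, 30))
ok_with_gap = tracker.constraints_satisfied()

tracker.update("reserve_hotel", "check_in_time", datetime(2024, 5, 1, 14, 30))
ok_too_tight = tracker.constraints_satisfied()
```

Keeping the check in one shared tracker, rather than inside either subtask, means both levels of the policy can see the same constraint status regardless of which subtask is active.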
Taxonomy
Topics: Speech and dialogue systems · Topic Modeling · Multi-Agent Systems and Negotiation
