End-to-end LSTM-based dialog control optimized with supervised and reinforcement learning

Jason D. Williams; Geoffrey Zweig

arXiv:1606.01269 · cs.CL · June 7, 2016 · 122 cites

TL;DR

This paper introduces an end-to-end LSTM-based dialog system that can be trained with both supervised learning and reinforcement learning, reducing manual feature engineering of dialog state for task-oriented dialog management.

Contribution

It presents a novel LSTM-based dialog control model that integrates supervised and reinforcement learning for efficient end-to-end training.

Findings

01 Supervised learning provides a good initial policy with few dialogs.

02 Reinforcement learning improves dialog policies through interaction.

03 Combining SL and RL accelerates learning and enhances performance.

Abstract

This paper presents a model for end-to-end learning of task-oriented dialog systems. The main component of the model is a recurrent neural network (an LSTM), which maps from raw dialog history directly to a distribution over system actions. The LSTM automatically infers a representation of dialog history, which relieves the system developer of much of the manual feature engineering of dialog state. In addition, the developer can provide software that expresses business rules and provides access to programmatic APIs, enabling the LSTM to take actions in the real world on behalf of the user. The LSTM can be optimized using supervised learning (SL), where a domain expert provides example dialogs which the LSTM should imitate; or using reinforcement learning (RL), where the system improves by interacting directly with end users. Experiments show that SL and RL are complementary: SL alone…
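The ideas in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation. It assumes that business rules are applied as a boolean mask over the LSTM's action scores, and that SL and RL both act on the same softmax output; the function names, the example scores, and the mask are all hypothetical.

```python
import math

def masked_action_distribution(logits, allowed):
    """Softmax over action scores, restricted to actions the rules allow.

    Assumption: business rules are expressed as a boolean mask; the
    abstract only says developer code expresses the rules.
    """
    if not any(allowed):
        raise ValueError("at least one action must be allowed")
    # Subtract the largest allowed score for numerical stability.
    peak = max(score for score, ok in zip(logits, allowed) if ok)
    weights = [math.exp(score - peak) if ok else 0.0
               for score, ok in zip(logits, allowed)]
    total = sum(weights)
    return [w / total for w in weights]

def sl_logit_gradient(probs, expert_action):
    """Gradient of -log p(expert_action) with respect to the action scores.

    Supervised imitation: push probability toward the expert's action.
    Assumes the expert's action is one the mask allows.
    """
    return [p - (1.0 if i == expert_action else 0.0)
            for i, p in enumerate(probs)]

def rl_logit_gradient(probs, taken_action, dialog_return):
    """REINFORCE-style gradient of -return * log p(taken_action).

    Same form as the supervised gradient, scaled by the return the
    dialog earned, so successful dialogs reinforce the sampled action.
    """
    return [dialog_return * g for g in sl_logit_gradient(probs, taken_action)]

# Hypothetical scores for three system actions; action 1 is ruled out.
logits = [2.0, 0.5, -1.0]
allowed = [True, False, True]
probs = masked_action_distribution(logits, allowed)
print(probs)
print(sl_logit_gradient(probs, expert_action=0))
print(rl_logit_gradient(probs, taken_action=2, dialog_return=1.0))
```

In this sketch a masked action receives probability zero and therefore zero gradient, so neither training signal can make the policy choose an action the rules forbid. The two gradients differ only in where the target action comes from (an expert's example versus the system's own sampled action) and in the scaling by the dialog's return.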

Taxonomy

Topics: Speech and dialogue systems · Multi-Agent Systems and Negotiation · Topic Modeling

Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory