Building End-To-End Dialogue Systems Using Generative Hierarchical Neural Network Models

Iulian V. Serban, Alessandro Sordoni, Yoshua Bengio, Aaron Courville, and Joelle Pineau

arXiv:1507.04808 · cs.CL · April 8, 2016

TL;DR

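This paper explores building open-domain conversational dialogue systems with a generative hierarchical recurrent encoder-decoder network. It shows that the model is competitive with state-of-the-art neural language models and back-off n-gram models, and that its performance improves when learning is bootstrapped from a larger question-answer pair corpus and from pretrained word embeddings.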

Contribution

It extends the hierarchical recurrent encoder-decoder neural network to the dialogue domain and shows that bootstrapping from a larger question-answer pair corpus and from pretrained word embeddings improves its performance.

Findings

01

The hierarchical recurrent encoder-decoder is competitive with state-of-the-art neural language models and back-off n-gram models.

02

Bootstrapping the learning from a larger question-answer pair corpus improves performance.

03

Bootstrapping from pretrained word embeddings improves performance.

Abstract

We investigate the task of building open domain, conversational dialogue systems based on large dialogue corpora using generative models. Generative models produce system responses that are autonomously generated word-by-word, opening up the possibility for realistic, flexible interactions. In support of this goal, we extend the recently proposed hierarchical recurrent encoder-decoder neural network to the dialogue domain, and demonstrate that this model is competitive with state-of-the-art neural language models and back-off n-gram models. We investigate the limitations of this and similar approaches, and show how its performance can be improved by bootstrapping the learning from a larger question-answer pair corpus and from pretrained word embeddings.
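
To make the hierarchical structure mentioned in the abstract concrete, here is a minimal sketch of a two-level recurrent encoder: one RNN reads the tokens of each utterance, and a second RNN reads the resulting utterance vectors to track the dialogue context. Everything in it is an illustrative assumption rather than the authors' implementation: the plain tanh recurrence, the tiny dimensions, the random untrained weights, and the names `rnn_step` and `encode_dialogue` are all hypothetical.

```python
import math
import random

def make_matrix(rows, cols, rng):
    """Small random weight matrix (untrained, for illustration only)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def rnn_step(w_in, w_rec, x, h):
    """One plain tanh RNN step: h' = tanh(w_in x + w_rec h)."""
    return [
        math.tanh(
            sum(w * xi for w, xi in zip(w_in[i], x))
            + sum(w * hi for w, hi in zip(w_rec[i], h))
        )
        for i in range(len(h))
    ]

def encode_dialogue(dialogue, emb, enc, ctx, hidden):
    """Return the context-level state after each utterance."""
    context = [0.0] * hidden
    states = []
    for utterance in dialogue:
        # Utterance-level encoder: runs over the tokens of one utterance.
        h = [0.0] * hidden
        for token in utterance:
            h = rnn_step(enc[0], enc[1], emb[token], h)
        # Context-level encoder: runs over utterance vectors.
        context = rnn_step(ctx[0], ctx[1], h, context)
        states.append(context)
    return states

rng = random.Random(0)
VOCAB, EMB, HIDDEN = 10, 4, 5  # assumed toy sizes
emb = make_matrix(VOCAB, EMB, rng)  # one embedding row per token id
enc = (make_matrix(HIDDEN, EMB, rng), make_matrix(HIDDEN, HIDDEN, rng))
ctx = (make_matrix(HIDDEN, HIDDEN, rng), make_matrix(HIDDEN, HIDDEN, rng))

dialogue = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]  # three utterances as token ids
states = encode_dialogue(dialogue, emb, enc, ctx, HIDDEN)
print(len(states), len(states[-1]))
```

In a full generative model, a decoder RNN would be conditioned on the latest context state (`states[-1]` here) and would emit the next utterance word by word; that part, along with training, is omitted from this sketch. Under the same assumptions, bootstrapping from pretrained word embeddings would amount to filling `emb` with pretrained vectors instead of random values.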
