Improving Variational Encoder-Decoders in Dialogue Generation

Xiaoyu Shen; Hui Su; Shuzi Niu; Vera Demberg

arXiv:1802.02032 · cs.CL · February 7, 2018 · 31 cites

Open Access

TL;DR

This paper proposes a two-phase training method for variational encoder-decoders (VEDs) in dialogue generation. The method allows a more flexible latent variable distribution and, in the authors' experiments, improves on the popular models it is compared against.

Contribution

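It introduces a two-phase training approach that separates autoencoding from latent space learning: the first phase autoencodes discrete text into continuous embeddings, and the second phase learns latent representations by reconstructing those embeddings.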

Findings

01 Significant improvement in dialogue generation quality

02 Better metric-based evaluation scores

03 Enhanced human evaluation results

Abstract

Variational encoder-decoders (VEDs) have shown promising results in dialogue generation. However, the latent variable distributions are usually approximated by a much simpler model than the powerful RNN structure used for encoding and decoding, which leads to the KL-vanishing problem and an inconsistent training objective. In this paper, we separate the training step into two phases: the first phase learns to autoencode discrete texts into continuous embeddings, from which the second phase learns to generalize latent representations by reconstructing the encoded embedding. In this setup, latent variables are sampled by transforming Gaussian noise through multi-layer perceptrons and are trained with a separate VED model, which has the potential to realize a much more flexible distribution. We compare our model with current popular models and the experiment demonstrates substantial improvement in…
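
The sampling step described in the abstract (latent variables obtained by passing Gaussian noise through a multi-layer perceptron) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, layer sizes, the tanh activation, the random untrained weights, and the choice to condition on a context vector by concatenating it with the noise are all assumptions made for illustration only.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Create one dense layer (weights, bias) with small random weights (hypothetical init)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
    bias = [0.0] * n_out
    return weights, bias


def mlp(x, layers):
    """Apply dense layers in order, with tanh after every layer except the last."""
    for i, (weights, bias) in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]
        if i < len(layers) - 1:
            x = [math.tanh(v) for v in x]
    return x


def sample_latent(context, layers, noise_dim, rng):
    """Draw eps ~ N(0, I) and transform it, together with the context, into a latent sample."""
    eps = [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]
    # Assumption: the context is injected by simple concatenation with the noise.
    return mlp(context + eps, layers)


rng = random.Random(0)
context_dim, noise_dim, hidden_dim, latent_dim = 4, 3, 8, 5
layers = [
    init_layer(context_dim + noise_dim, hidden_dim, rng),
    init_layer(hidden_dim, latent_dim, rng),
]

context = [0.1, -0.2, 0.3, 0.0]  # stand-in for an encoded dialogue context
z1 = sample_latent(context, layers, noise_dim, rng)
z2 = sample_latent(context, layers, noise_dim, rng)
print(len(z1), z1 != z2)
```

Because the noise passes through a nonlinear network, the resulting distribution over z is not restricted to a diagonal Gaussian, which is the sense in which such a sampler may be more flexible. Two calls with the same context give different samples because fresh noise is drawn each time.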

Taxonomy

Topics: Topic Modeling · Speech and dialogue systems · Natural Language Processing Techniques