Rethinking Action Spaces for Reinforcement Learning in End-to-end Dialog Agents with Latent Variable Models

Tiancheng Zhao; Kaige Xie; Maxine Eskenazi

arXiv:1902.08858 · cs.CL · April 16, 2019 · 22 cites

TL;DR

This paper introduces a latent action framework for end-to-end dialog agents. The agent's action space is induced from data without supervision, and reinforcement learning over these latent actions outperforms word-level policy gradient methods.

Contribution

The paper treats the action space of a dialog agent as a latent variable, replacing handcrafted dialog acts and the output vocabulary as the action space. Its experiments cover both continuous and discrete latent actions and two optimization methods based on stochastic variational inference.

Findings

01. Latent actions outperform word-level policy gradient methods.

02. Both continuous and discrete action types are effective.

03. Unsupervised induction of action spaces is feasible and beneficial.

Abstract

Defining action spaces for conversational agents and optimizing their decision-making process with reinforcement learning is an enduring challenge. Common practice has been to use handcrafted dialog acts, or the output vocabulary, e.g., in neural encoder-decoders, as the action spaces. Both have their own limitations. This paper proposes a novel latent action framework that treats the action spaces of an end-to-end dialog agent as latent variables and develops unsupervised methods to induce the agent's action space from the data. Comprehensive experiments are conducted examining both continuous and discrete action types and two different optimization methods based on stochastic variational inference. Results show that the proposed latent actions achieve superior empirical performance over previous word-level policy gradient methods on both DealOrNoDeal and MultiWoz…
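
As an illustration of what a policy gradient over latent actions might look like, here is a minimal sketch, not the authors' implementation. It assumes a single categorical latent action z whose policy is parameterized directly by a vector of logits, and applies a plain REINFORCE update to those logits rather than to per-word output probabilities. The function names, the four-action space, the learning rate, and the toy reward are all hypothetical choices made for this example.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over a 1-D array of logits."""
    shifted = logits - np.max(logits)
    exp = np.exp(shifted)
    return exp / exp.sum()


def latent_reinforce_grad(logits, z, reward):
    """Single-sample REINFORCE gradient of expected reward w.r.t. the logits.

    Uses the identity d/dlogits log p(z) = one_hot(z) - softmax(logits).
    """
    probs = softmax(logits)
    one_hot = np.zeros_like(probs)
    one_hot[z] = 1.0
    return reward * (one_hot - probs)


def toy_reward(z):
    """Hypothetical reward: only latent action 2 leads to task success."""
    return 1.0 if z == 2 else 0.0


rng = np.random.default_rng(0)
logits = np.zeros(4)  # assumption: 4 discrete latent actions, uniform start

for _ in range(200):
    probs = softmax(logits)
    z = rng.choice(len(probs), p=probs)  # sample a latent action
    # Gradient ascent on expected reward; 0.5 is an arbitrary step size.
    logits += 0.5 * latent_reinforce_grad(logits, z, toy_reward(z))

final_probs = softmax(logits)
print(final_probs)
```

In a full dialog agent the logits would presumably come from an encoder over the dialog context, and a decoder would generate the response conditioned on the sampled z. The sketch isolates only the update at the latent level, where the reward adjusts one decision per turn instead of one decision per generated word.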

Taxonomy

Topics: Topic Modeling · Speech and dialogue systems · Multimodal Machine Learning Applications