LAVA: Latent Action Spaces via Variational Auto-encoding for Dialogue Policy Optimization

Nurul Lubis; Christian Geishauser; Michael Heck; Hsien-chin Lin; Marco Moresi; Carel van Niekerk; Milica Gašić

arXiv:2011.09378·cs.CL·November 19, 2020

TL;DR

This paper introduces LAVA, a method that uses variational auto-encoding to create a meaningful latent action space for dialogue policy optimization, improving reinforcement learning efficiency and success rates in task-oriented dialogue systems.

Contribution

The paper proposes leveraging auxiliary response auto-encoding tasks to shape latent action spaces, enabling more effective end-to-end dialogue policy training with state-of-the-art results.

Findings

1. Latent action spaces improve RL training in dialogue systems.
2. Auxiliary auto-encoding enhances the interpretability of latent representations.
3. LAVA achieves state-of-the-art success rates in dialogue policy optimization.

Abstract

Reinforcement learning (RL) can enable task-oriented dialogue systems to steer the conversation towards successful task completion. In an end-to-end setting, a response can be constructed in a word-level sequential decision-making process with the entire system vocabulary as action space. Policies trained in such a fashion do not require expert-defined action spaces, but they have to deal with large action spaces and long trajectories, making RL impractical. Using the latent space of a variational model as action space alleviates this problem. However, current approaches use an uninformed prior for training and optimize the latent distribution solely on the context. It is therefore unclear whether the latent representation truly encodes the characteristics of different actions. In this paper, we explore three ways of leveraging an auxiliary task to shape the latent variable…
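To make the abstract's contrast concrete, here is a toy sketch of policy-gradient learning where the action is a single draw from a small categorical latent variable rather than a sequence of word choices over a full vocabulary. It is not the LAVA model: there is no encoder, decoder, or variational training, and the names `reinforce_step`, `train_latent_policy`, `num_actions`, and `good_action`, as well as the 0/1 reward, are assumptions made up for illustration. It only shows that with a latent action space the policy makes one decision per turn over a handful of options.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_categorical(probs, rng):
    """Draw one index from a categorical distribution."""
    r = rng.random()
    cum = 0.0
    for i, p in enumerate(probs):
        cum += p
        if r < cum:
            return i
    return len(probs) - 1


def reinforce_step(logits, reward_fn, rng, lr=0.5):
    """One REINFORCE update on a single categorical latent action.

    Hypothetical toy setup: one latent decision per dialogue turn,
    with a scalar reward observed immediately.
    """
    probs = softmax(logits)
    z = sample_categorical(probs, rng)
    reward = reward_fn(z)
    # Gradient of log p(z) w.r.t. logit i is 1[i == z] - p_i.
    return [
        logit + lr * reward * ((1.0 if i == z else 0.0) - p)
        for i, (logit, p) in enumerate(zip(logits, probs))
    ]


def train_latent_policy(num_actions=4, good_action=2, steps=500, seed=0):
    """Train the toy policy; only `good_action` earns reward (assumption)."""
    rng = random.Random(seed)
    logits = [0.0] * num_actions

    def reward_fn(z):
        return 1.0 if z == good_action else 0.0

    for _ in range(steps):
        logits = reinforce_step(logits, reward_fn, rng)
    return softmax(logits)


if __name__ == "__main__":
    print(train_latent_policy())
```

In a word-level setting the same update would have to be applied at every token position over the whole vocabulary, which is the large-action-space, long-trajectory problem the abstract describes.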

Code & Models

Repositories

budzianowski/multiwoz (PyTorch, official)
