Few-Shot Structured Policy Learning for Multi-Domain and Multi-Task Dialogues

Thibault Cordier; Tanguy Urvoy; Fabrice Lefevre; Lina M. Rojas-Barahona

arXiv:2302.11199·cs.CL·February 23, 2023

Open Access

TL;DR

This paper introduces structured policies, especially graph neural networks, to enhance sample efficiency in multi-domain, multi-task dialogue learning, demonstrating high success rates with limited data from both simulated and human experts.

Contribution

The paper proposes the use of structured policies like GNNs for more sample-efficient dialogue management in complex environments, highlighting their effectiveness over unstructured approaches.

Findings

01 GNNs reach a success rate above 80% with only 50 dialogues when learning from simulated experts.

02 Among the levels of structure tested, GNNs are the most sample-efficient in multi-domain dialogues.

03 Performance drops when learning from human experts rather than simulated ones.

Abstract

Reinforcement learning has been widely adopted to model dialogue managers in task-oriented dialogues. However, the user simulators provided by state-of-the-art dialogue frameworks are only rough approximations of human behaviour. The ability to learn from a small number of human interactions is hence crucial, especially in multi-domain and multi-task environments where the action space is large. We therefore propose to use structured policies to improve sample efficiency when learning in these kinds of environments. We also evaluate the impact of learning from human vs simulated experts. Among the different levels of structure that we tested, the graph neural networks (GNNs) show a remarkable superiority by reaching a success rate above 80% with only 50 dialogues, when learning from simulated experts. They also show superiority when learning from human experts, although a performance…

Taxonomy

Topics: Topic Modeling · Speech and dialogue systems · Intelligent Tutoring Systems and Adaptive Learning