Offline Reinforcement Learning for Mixture-of-Expert Dialogue Management

Dhawal Gupta; Yinlam Chow; Aza Tulepbergenov; Mohammad Ghavamzadeh; Craig Boutilier

arXiv:2302.10850 · cs.LG · October 31, 2023 · 1 citation

TL;DR

This paper introduces RL algorithms tailored for dialogue management that leverage Mixture-of-Expert Language Models to reduce action space complexity and enhance multi-turn conversational effectiveness.

Contribution

It develops novel RL methods that utilize MoE-LMs to improve dialogue management by addressing large action spaces and increasing response diversity.

Findings

01 Enhanced dialogue diversity and intent coverage

02 Improved RL-based dialogue management performance

03 Effective handling of large action spaces

Abstract

Reinforcement learning (RL) has shown great promise for developing dialogue management (DM) agents that are non-myopic, conduct rich conversations, and maximize overall user satisfaction. Despite recent developments in RL and language models (LMs), using RL to power conversational chatbots remains challenging, in part because RL requires online exploration to learn effectively, whereas collecting novel human-bot interactions can be expensive and unsafe. This issue is exacerbated by the combinatorial action spaces facing these algorithms, as most LM agents generate responses at the word level. We develop a variety of RL algorithms, specialized to dialogue planning, that leverage recent Mixture-of-Expert Language Models (MoE-LMs) -- models that capture diverse semantics, generate utterances reflecting different intents, and are amenable to multi-turn DM. By exploiting MoE-LM structure,…
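
To make the action-space point concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a hypothetical setup in which each expert has already proposed one candidate utterance for the current turn and a learned scorer has assigned each expert a value. The intent names, utterances, scores, and both function names are invented for illustration. The sketch contrasts the number of word-level actions with the number of expert-level actions, then picks a response greedily.

```python
from typing import Dict


def token_level_action_count(vocab_size: int, length: int) -> int:
    """Count distinct word sequences of exactly `length` tokens."""
    return vocab_size ** length


def select_expert(q_values: Dict[str, float]) -> str:
    """Greedy expert-level choice: return the intent with the highest score."""
    return max(q_values, key=q_values.get)


# Hypothetical candidates, one utterance per expert intent (illustrative only).
candidates = {
    "empathize": "That sounds frustrating, I'm sorry.",
    "ask_question": "What happened next?",
    "suggest": "Have you tried restarting it?",
}

# Hypothetical scores standing in for a learned value function.
q_values = {"empathize": 0.2, "ask_question": 0.7, "suggest": 0.5}

chosen_intent = select_expert(q_values)
response = candidates[chosen_intent]

# Word-level generation: vocab_size ** length possible actions per turn.
word_level_actions = token_level_action_count(vocab_size=10, length=3)
# Expert-level selection: one action per expert.
expert_level_actions = len(candidates)
```

Even with a toy vocabulary of 10 words and 3-token responses, the word-level count is far larger than the number of experts; with a realistic vocabulary and response length the gap grows exponentially in the response length.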

Videos

Offline Reinforcement Learning for Mixture-of-Expert Dialogue Management · SlidesLive

Taxonomy

Topics: Topic Modeling · Speech and dialogue systems · Natural Language Processing Techniques