Distributed Structured Actor-Critic Reinforcement Learning for Universal Dialogue Management
Zhi Chen, Lu Chen, Xiaoyuan Liu, and Kai Yu

TL;DR
This paper proposes a distributed structured actor-critic reinforcement learning approach to improve dialogue management in task-oriented spoken dialogue systems, focusing on policy decision-making within a POMDP framework.
Contribution
It introduces a novel distributed structured actor-critic method tailored for dialogue policy optimization, advancing the application of DRL in dialogue systems.
Findings
Enhanced policy learning efficiency
Improved dialogue success rates
Better generalization in dialogue tasks
Abstract
A task-oriented spoken dialogue system (SDS) aims to assist a human user in accomplishing a specific task (e.g., hotel booking). Dialogue management is a core part of an SDS. It has two main tasks: dialogue belief state tracking (summarising the conversation history) and dialogue decision-making (deciding how to reply to the user). In this work, we focus only on devising a policy that chooses which dialogue action to take in response to the user. The sequential system decision-making process can be abstracted as a partially observable Markov decision process (POMDP). Under this framework, reinforcement learning approaches can be used for automated policy optimization. In the past few years, many deep reinforcement learning (DRL) algorithms, which use neural networks (NN) as function approximators, have been investigated for dialogue policy.
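To make the policy-optimization setting in the abstract concrete, here is a minimal sketch of a generic one-step actor-critic update over a belief-state vector. It is not the paper's distributed structured method: it assumes a single learner, linear function approximators in place of neural networks, and a toy one-turn "dialogue" invented purely for illustration. All names (`LinearActorCritic`, `run_toy_dialogue`, the reward values) are hypothetical.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over a 1-D array of logits."""
    shifted = logits - logits.max()
    exp = np.exp(shifted)
    return exp / exp.sum()


class LinearActorCritic:
    """Generic one-step actor-critic with linear actor and critic (illustrative only)."""

    def __init__(self, belief_dim, num_actions, lr=0.05, gamma=0.99, seed=0):
        self.rng = np.random.default_rng(seed)
        self.W_pi = np.zeros((num_actions, belief_dim))  # actor weights
        self.w_v = np.zeros(belief_dim)                  # critic weights
        self.lr = lr
        self.gamma = gamma

    def policy(self, belief):
        """Probability of each dialogue action given the belief state."""
        return softmax(self.W_pi @ belief)

    def act(self, belief):
        """Sample a dialogue action from the current policy."""
        probs = self.policy(belief)
        return int(self.rng.choice(len(probs), p=probs))

    def update(self, belief, action, reward, next_belief, done):
        """Apply one temporal-difference update to the critic and the actor."""
        value = self.w_v @ belief
        next_value = 0.0 if done else self.w_v @ next_belief
        td_error = reward + self.gamma * next_value - value

        # Critic: move V(belief) toward the one-step TD target.
        self.w_v += self.lr * td_error * belief

        # Actor: gradient of log pi(action | belief) for a linear softmax policy
        # is (one_hot(action) - probs) outer belief.
        probs = self.policy(belief)
        grad_logits = -probs
        grad_logits[action] += 1.0
        self.W_pi += self.lr * td_error * np.outer(grad_logits, belief)
        return td_error


def run_toy_dialogue(steps=200):
    """Toy one-turn task (an assumption): action 1 succeeds (reward 1), action 0 fails (reward 0)."""
    agent = LinearActorCritic(belief_dim=2, num_actions=2)
    belief = np.array([1.0, 0.0])  # hypothetical fixed belief state
    for _ in range(steps):
        action = agent.act(belief)
        reward = 1.0 if action == 1 else 0.0
        agent.update(belief, action, reward, belief, done=True)
    return agent.policy(belief)
```

Calling `run_toy_dialogue()` returns the learned action probabilities for the toy belief state; the policy should come to prefer the rewarded action. In a real dialogue manager the belief state would come from a belief tracker and the reward from task success and turn penalties, neither of which is modelled here.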
