Universal Post-Processing Networks for Joint Optimization of Modules in Task-Oriented Dialogue Systems
Atsumoto Ohashi, Ryuichiro Higashinaka

TL;DR
This paper introduces universal post-processing networks (UniPPNs), which post-process the outputs of all modules in a task-oriented dialogue system and are jointly optimized with reinforcement learning to improve task completion performance.
Contribution
The study proposes a novel joint optimization method with language-model-based UniPPNs and a module-level RL algorithm for stabilizing learning across all modules.
Findings
UniPPN outperforms traditional PPNs in task completion.
Joint optimization improves overall dialogue system performance.
Experiments on the MultiWOZ dataset confirm the method's effectiveness.
Abstract
Post-processing networks (PPNs) are components that modify the outputs of arbitrary modules in task-oriented dialogue systems and are optimized using reinforcement learning (RL) to improve the overall task completion capability of the system. However, previous PPN-based approaches have been limited to handling only a subset of modules within a system, which poses a significant limitation in improving the system performance. In this study, we propose a joint optimization method for post-processing the outputs of all modules using universal post-processing networks (UniPPNs), which are language-model-based networks that can modify the outputs of arbitrary modules in a system as a sequence-transformation task. Moreover, our RL algorithm, which employs a module-level Markov decision process, enables fine-grained value and advantage estimation for each module, thereby stabilizing joint…
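The idea of one network post-processing every module's output, with each post-processing decision treated as its own step, can be illustrated with a minimal sketch. This is not the authors' implementation. The four-module pipeline (NLU, DST, policy, NLG), the `Step` record, `run_turn`, and all toy functions are hypothetical names chosen for illustration, and the rule-based `toy_post_process` merely stands in for the language-model-based UniPPN.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Step:
    """One module-level step: what the module emitted and what was passed on."""
    module: str
    original: str
    modified: str


def run_turn(
    user_utterance: str,
    modules: List[Tuple[str, Callable[[str], str]]],
    post_process: Callable[[str, str], str],
) -> Tuple[str, List[Step]]:
    """Run one dialogue turn, post-processing the output of every module.

    A single post_process function handles all modules by treating each
    output as text to be transformed (an assumption made for this sketch).
    """
    steps: List[Step] = []
    current = user_utterance
    for name, module in modules:
        output = module(current)
        modified = post_process(name, output)
        # Each module contributes its own step, so a learner could assign
        # value and advantage per module rather than per turn.
        steps.append(Step(name, output, modified))
        current = modified
    return current, steps


# Hypothetical toy modules standing in for a real pipeline.
def toy_nlu(text: str) -> str:
    return "inform(area=" + text.split()[-1] + ")"


def toy_dst(acts: str) -> str:
    return "state{" + acts + "}"


def toy_policy(state: str) -> str:
    return "request(food)"


def toy_nlg(acts: str) -> str:
    return "system: " + acts


def toy_post_process(module_name: str, output: str) -> str:
    # Stand-in for a learned network: rewrites only the policy output here.
    if module_name == "policy" and output == "request(food)":
        return "request(food, price)"
    return output


TOY_MODULES = [
    ("nlu", toy_nlu),
    ("dst", toy_dst),
    ("policy", toy_policy),
    ("nlg", toy_nlg),
]

response, steps = run_turn("hotel in centre", TOY_MODULES, toy_post_process)
```

In this sketch a turn yields four steps, one per module, which is the granularity at which a module-level formulation would estimate values and advantages. How the rewards are assigned and how the network is trained are not shown here.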
Taxonomy
Topics: Speech and dialogue systems · Robotics and Automated Systems · Service-Oriented Architecture and Web Services
