Workflow-Guided Response Generation for Task-Oriented Dialogue
Do June Min, Paloma Sodhi, Ramya Ramakrishnan

TL;DR
This paper introduces a reinforcement learning framework for task-oriented dialogue systems that explicitly optimizes generated responses to follow a specified workflow, improving compliance and naturalness over previous supervised methods.
Contribution
It proposes a novel RL-based approach with a ComplianceScorer to explicitly optimize workflow compliance in dialogue response generation.
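The idea of scoring a response for workflow compliance and using that score as a reward signal can be sketched in a few lines. This is only an illustrative toy under loud assumptions: the cue-phrase table `ACTION_CUES`, the functions `toy_compliance_score` and `centered_rewards`, and the action names are all hypothetical and are not taken from the paper. The keyword heuristic merely stands in for a compliance scorer, and the policy update itself is not shown.

```python
# Toy illustration only: a keyword heuristic standing in for a compliance
# scorer. None of these names or cue phrases come from the paper.

# Hypothetical mapping from a workflow action to phrases that suggest
# the response carries out that action.
ACTION_CUES = {
    "verify-identity": ("name", "account id", "zip code"),
    "offer-refund": ("refund",),
}


def toy_compliance_score(response: str, action: str) -> float:
    """Return the fraction of cue phrases for `action` found in `response`."""
    cues = ACTION_CUES[action]
    text = response.lower()
    hits = sum(1 for cue in cues if cue in text)
    return hits / len(cues)


def centered_rewards(candidates: list[str], action: str) -> list[float]:
    """Score sampled candidates and subtract the mean score as a baseline.

    In a policy-gradient setup, values like these would weight the
    log-likelihood of each sampled response; that step is omitted here.
    """
    scores = [toy_compliance_score(c, action) for c in candidates]
    baseline = sum(scores) / len(scores)
    return [s - baseline for s in scores]


# Two hypothetical sampled responses for the same dialogue turn.
candidates = [
    "Sure, I can help with that.",
    "Could you give me your name and account ID?",
]
advantages = centered_rewards(candidates, "verify-identity")
best = candidates[max(range(len(candidates)), key=advantages.__getitem__)]
```

Here the second candidate mentions two of the three hypothetical cues, so it receives a positive centered reward while the generic reply receives a negative one. A learned scorer would replace the keyword match in any realistic setting.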
Findings
Outperforms baseline methods in compliance and naturalness
Effective on multiple datasets including ABCD and MultiWOZ 2.2
Enhances response quality with explicit workflow adherence
Abstract
Task-oriented dialogue (TOD) systems aim to achieve specific goals through interactive dialogue. Such tasks usually involve following specific workflows, i.e., executing a sequence of actions in a particular order. While prior work has focused on supervised learning methods that condition on past actions, these methods do not explicitly optimize for compliance with a desired workflow. In this paper, we propose a novel framework based on reinforcement learning (RL) to generate dialogue responses that are aligned with a given workflow. Our framework consists of ComplianceScorer, a metric designed to evaluate how well a generated response executes the specified action, combined with an RL optimization process that utilizes an interactive sampling technique. We evaluate our approach on two TOD datasets, Action-Based Conversations Dataset (ABCD) (Chen et al., 2021a) and MultiWOZ 2.2 (Zang et al., 2020) on…
Taxonomy
Topics: Speech and dialogue systems · Topic Modeling · AI in Service Interactions
