Deep RL with Hierarchical Action Exploration for Dialogue Generation

Itsugun Cho; Ryota Takahashi; Yusaku Yanase; Hiroaki Saito

arXiv:2303.13465 · cs.CL · May 16, 2023 · 1 citation

Open Access

TL;DR

This paper introduces a hierarchical action exploration method in deep reinforcement learning for dialogue generation, improving efficiency and response quality by using a dual-granularity Q-function and offline RL.

Contribution

The paper proposes a hierarchical exploration strategy built on a dual-granularity Q-function, and applies offline RL with multiple reward functions to improve dialogue responses.

Findings

01. Outperforms baseline models on automatic metrics

02. Generates responses with higher expected rewards

03. Demonstrates improved explainability and controllability

Abstract

Traditionally, approximate dynamic programming is employed in dialogue generation with greedy policy improvement through action sampling, as the natural language action space is vast. However, this practice is inefficient for reinforcement learning (RL) due to the sparsity of eligible responses with high action values, which leads to weak improvement sustained by random sampling. This paper presents theoretical analysis and experiments that reveal the performance of the dialogue policy is positively correlated with the sampling size. To overcome this limitation, we introduce a novel dual-granularity Q-function that explores the most promising response category to intervene in the sampling process. Our approach extracts actions based on a grained hierarchy, thereby achieving the optimum with fewer policy iterations. Additionally, we use offline RL and learn from multiple reward functions…
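
The contrast the abstract draws, between greedy improvement over randomly sampled actions and sampling steered by a coarse-grained Q-function, can be illustrated with a toy sketch. This is not the paper's implementation: the integer actions, the ten categories, the lookup-table value functions, and the names `flat_select` and `hierarchical_select` are all hypothetical, and the coarse Q-function is simply assumed to be the mean fine-grained value of a category rather than something learned from offline data.

```python
import random

def flat_select(q_fine, actions, k, rng):
    """Greedy improvement over k actions sampled at random from the whole space."""
    candidates = rng.sample(actions, k)
    return max(candidates, key=q_fine)

def hierarchical_select(q_coarse, q_fine, categories, k, rng):
    """Pick the most promising category with the coarse Q-function,
    then sample k candidates only from that category."""
    best_category = max(categories, key=q_coarse)
    pool = categories[best_category]
    candidates = rng.sample(pool, min(k, len(pool)))
    return max(candidates, key=q_fine)

# Toy action space (assumption): 10 categories of 100 integer "responses" each.
rng = random.Random(0)
categories = {c: list(range(c * 100, (c + 1) * 100)) for c in range(10)}
all_actions = [a for acts in categories.values() for a in acts]

# Toy action values (assumption): only category 7 holds high-value responses,
# mimicking the sparsity of eligible responses described in the abstract.
values = {}
for c, acts in categories.items():
    base = 1.0 if c == 7 else 0.0
    for a in acts:
        values[a] = base + 0.5 * rng.random()

def q_fine(action):
    # Fine-grained Q-function: value of a single response (lookup table here).
    return values[action]

def q_coarse(category):
    # Coarse-grained Q-function: assumed to be the mean value within the category.
    acts = categories[category]
    return sum(values[a] for a in acts) / len(acts)

# Compare the average value of the selected action at the same sampling size k.
k, trials = 5, 200
flat_mean = sum(
    q_fine(flat_select(q_fine, all_actions, k, rng)) for _ in range(trials)
) / trials
hier_mean = sum(
    q_fine(hierarchical_select(q_coarse, q_fine, categories, k, rng))
    for _ in range(trials)
) / trials

print(f"flat sampling:         {flat_mean:.3f}")
print(f"hierarchical sampling: {hier_mean:.3f}")
```

In this toy setup, flat sampling with a small `k` often misses the one high-value category entirely, while the hierarchical variant always draws its candidates from it. Raising `k` narrows the gap, which mirrors the abstract's claim that policy performance is positively correlated with the sampling size.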

Taxonomy

Topics: Topic Modeling · Reinforcement Learning in Robotics · AI in Service Interactions