Sequence Generation with Guider Network

Ruiyi Zhang; Changyou Chen; Zhe Gan; Wenlin Wang; Liqun Chen; Dinghan Shen; Guoyin Wang; Lawrence Carin

arXiv:1811.00696 · cs.CL · November 5, 2018 · 5 cites

TL;DR

This paper introduces a guider network for sequence generation that provides intermediate rewards, addressing the sparse-reward problem in reinforcement learning and improving sequence quality.

Contribution

The paper proposes a novel guider network that models the sequence-generation environment and supplies intermediate rewards, enhancing RL-based sequence generation.

Findings

1. Improved sequence quality in unconditional generation tasks

2. Enhanced performance in conditional sequence generation

3. Effective handling of the sparse-reward problem

Abstract

Sequence generation with reinforcement learning (RL) has received significant attention recently. However, a challenge with such methods is the sparse-reward problem in the RL training process, in which a scalar guiding signal is often only available after an entire sequence has been generated. This type of sparse reward tends to ignore the global structural information of a sequence, causing generation of sequences that are semantically inconsistent. In this paper, we present a model-based RL approach to overcome this issue. Specifically, we propose a novel guider network to model the sequence-generation environment, which can assist next-word prediction and provide intermediate rewards for generator optimization. Extensive experiments show that the proposed method leads to improved performance for both unconditional and conditional sequence-generation tasks.
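
The contrast between a single end-of-sequence reward and per-step intermediate rewards can be sketched as follows. This is a toy illustration only, not the paper's guider network: `toy_final_score` and `toy_prefix_score` are hypothetical hand-written scorers standing in for a learned model, and the return-to-go computation is the standard one used in policy-gradient methods.

```python
from typing import Callable, List, Sequence


def discounted_returns(rewards: Sequence[float], gamma: float = 1.0) -> List[float]:
    """Return-to-go G_t = r_t + gamma * G_{t+1} for every time step."""
    returns = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def sparse_rewards(tokens: Sequence[str],
                   final_score: Callable[[Sequence[str]], float]) -> List[float]:
    """One scalar after the whole sequence; zero at every earlier step."""
    if not tokens:
        return []
    return [0.0] * (len(tokens) - 1) + [final_score(tokens)]


def intermediate_rewards(tokens: Sequence[str],
                         prefix_score: Callable[[Sequence[str]], float]) -> List[float]:
    """One reward per step, computed by scoring each prefix of the sequence."""
    return [prefix_score(tokens[: t + 1]) for t in range(len(tokens))]


# Hypothetical scorers (assumptions for illustration, not from the paper).
def toy_final_score(tokens: Sequence[str]) -> float:
    """Reward 1.0 if the finished sequence ends with a period."""
    return 1.0 if tokens[-1] == "." else 0.0


def toy_prefix_score(prefix: Sequence[str]) -> float:
    """Reward 0.0 if the newest token repeats the previous one, else 1.0."""
    if len(prefix) >= 2 and prefix[-1] == prefix[-2]:
        return 0.0
    return 1.0


tokens = ["the", "cat", "cat", "sat", "."]

sparse = sparse_rewards(tokens, toy_final_score)
dense = intermediate_rewards(tokens, toy_prefix_score)

sparse_returns = discounted_returns(sparse)  # every step gets the same credit
dense_returns = discounted_returns(dense)    # credit differs from step to step
```

With the sparse signal, every position receives the same return, so the repeated token is credited exactly like its neighbours. With per-step rewards, the return drops at the step that introduced the repetition, which is the kind of finer-grained signal that intermediate rewards are meant to supply.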

Taxonomy

Topics: Multimodal Machine Learning Applications · Topic Modeling · Domain Adaptation and Few-Shot Learning