Generating Text with Deep Reinforcement Learning

Hongyu Guo

arXiv:1510.09202·cs.CL·November 2, 2015·41 cites


TL;DR

This paper presents a sequence-to-sequence decoding method in which a Deep Q-Network iteratively modifies the generated sequence, handling easier portions first and then turning to the difficult parts. It outperforms the baseline decoder on unseen sentences in terms of BLEU score.

Contribution

Introduces a reinforcement learning-based decoding schema for sequence generation that iteratively refines output sequences, emphasizing difficult parts, and demonstrates improved performance on unseen sentences.

Findings

01. Outperforms the baseline on unseen sentences in BLEU score
02. Effectively focuses on difficult sequence parts during decoding
03. Achieves competitive results on training data

Abstract

We introduce a novel schema for sequence-to-sequence learning with a Deep Q-Network (DQN), which decodes the output sequence iteratively. The aim is to enable the decoder to first tackle the easier portions of a sequence and then turn to the difficult parts. Specifically, in each iteration, an encoder-decoder Long Short-Term Memory (LSTM) network is employed to automatically create, from the input sequence, features that represent the internal states of the DQN and to formulate a list of potential actions for it. Taking the rephrasing of a natural sentence as an example, this list can contain ranked potential words. Next, the DQN learns to decide which action (e.g., word) to select from the list to modify the current decoded sequence. The newly modified output sequence is subsequently used as the input to the DQN for the next decoding iteration. In each iteration, we also…
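The iterative decoding loop described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation. The names `iterative_decode` and `make_toy_q` are hypothetical, the action format (a position paired with a replacement word) is an assumption, and a hand-written scoring function stands in for the learned DQN and the LSTM-generated candidate list. The sketch only shows the control flow: score each candidate edit, apply the best one, and feed the modified sequence into the next iteration.

```python
from typing import Callable, List, Sequence, Tuple

# Assumed action format: (position to modify, replacement word).
Action = Tuple[int, str]


def iterative_decode(
    initial: Sequence[str],
    candidates: Sequence[Sequence[str]],
    q_value: Callable[[List[str], Action], float],
    max_iters: int = 10,
) -> List[str]:
    """Greedily apply the highest-scoring single-word edit in each iteration."""
    current = list(initial)
    for _ in range(max_iters):
        # Every candidate word that differs from the current word is an action.
        actions = [
            (pos, word)
            for pos, words in enumerate(candidates)
            for word in words
            if word != current[pos]
        ]
        if not actions:
            break
        best = max(actions, key=lambda a: q_value(current, a))
        if q_value(current, best) <= 0.0:
            break  # no edit is predicted to improve the sequence
        pos, word = best
        current[pos] = word  # the modified sequence is the next iteration's input
    return current


def make_toy_q(target: Sequence[str]) -> Callable[[List[str], Action], float]:
    """Toy stand-in for a learned Q-function: +1 if an edit fixes a position,
    -1 if it breaks a correct one, 0 otherwise."""
    def q(current: List[str], action: Action) -> float:
        pos, word = action
        return float(word == target[pos]) - float(current[pos] == target[pos])
    return q


target = ["the", "cat", "sat"]
candidates = [["a", "the"], ["dog", "cat"], ["sat", "ran"]]
result = iterative_decode(["a", "dog", "sat"], candidates, make_toy_q(target))
print(result)
```

In this toy setup the scoring function is given the target sequence, which a real decoder would not have at test time. A trained DQN would instead estimate the value of each edit from learned features.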


Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis

Methods: Sigmoid Activation · Tanh Activation · Q-Learning · Dense Connections · Convolution · Deep Q-Network · Long Short-Term Memory