Sequence Generation: From Both Sides to the Middle

Long Zhou; Jiajun Zhang; Chengqing Zong; Heng Yu

arXiv:1906.09601 · cs.CL · June 25, 2019 · 1 cites

TL;DR

This paper introduces a bidirectional sequence generation model that predicts outputs from both ends towards the middle, significantly speeding up decoding and enhancing quality in tasks like translation and summarization.

Contribution

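The paper proposes a synchronous bidirectional sequence generation (SBSG) model that decodes from both ends of the output simultaneously, using interactive bidirectional attention. It targets two weaknesses of left-to-right generation: slow decoding on long outputs and the lack of guidance from future context.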

Findings

01. Speeds up decoding compared to conventional left-to-right autoregressive decoding.

02. Improves generation quality in translation and summarization.

03. Interaction between the left-to-right and right-to-left generation lets each direction help the other.

Abstract

The encoder-decoder framework has achieved promising progress for many sequence generation tasks, such as neural machine translation and text summarization. Such a framework usually generates a sequence token by token from left to right; hence (1) this autoregressive decoding procedure is time-consuming when the output sentence becomes longer, and (2) it lacks the guidance of future context, which is crucial to avoid under-translation. To alleviate these issues, we propose a synchronous bidirectional sequence generation (SBSG) model which predicts its outputs from both sides to the middle simultaneously. In the SBSG model, we enable the left-to-right (L2R) and right-to-left (R2L) generation to help and interact with each other by leveraging an interactive bidirectional attention network. Experiments on neural machine translation (En-De, Ch-En, and En-Ro) and text summarization tasks show…
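
The "both sides to the middle" schedule described in the abstract can be illustrated with a toy decoding loop. This is a minimal sketch, not the authors' implementation: the names `decode_both_sides` and `make_oracle_step` are hypothetical, and the oracle step function simply reads tokens off a known target where the paper's model would use a trained decoder with interactive bidirectional attention. The sketch only shows why emitting one L2R and one R2L token per step roughly halves the number of decoding steps, and that each direction can see the other's prefix.

```python
from typing import Callable, List, Optional, Tuple

Token = str
# Hypothetical step function: given the L2R prefix and the R2L prefix generated
# so far, return the next token for each direction (None means "nothing to emit").
StepFn = Callable[[List[Token], List[Token]], Tuple[Optional[Token], Optional[Token]]]


def decode_both_sides(step: StepFn, max_steps: int) -> Tuple[List[Token], int]:
    """Generate a sequence from both ends toward the middle.

    Returns the assembled sequence and the number of decoding steps used.
    """
    left: List[Token] = []   # tokens generated left-to-right
    right: List[Token] = []  # tokens generated right-to-left (last word first)
    steps = 0
    while steps < max_steps:
        # Both directions receive both prefixes, so each can condition on the other.
        next_left, next_right = step(left, right)
        if next_left is None and next_right is None:
            break  # the two halves have met in the middle
        if next_left is not None:
            left.append(next_left)
        if next_right is not None:
            right.append(next_right)
        steps += 1
    # The R2L half was produced in reverse order, so flip it before joining.
    return left + right[::-1], steps


def make_oracle_step(target: List[Token]) -> StepFn:
    """Toy stand-in for a trained model: reads the next tokens off a known target."""
    def step(left: List[Token], right: List[Token]):
        remaining = len(target) - len(left) - len(right)
        if remaining <= 0:
            return None, None
        next_left = target[len(left)]
        # With a single token left, only the L2R side emits it.
        next_right = target[len(target) - 1 - len(right)] if remaining >= 2 else None
        return next_left, next_right
    return step


if __name__ == "__main__":
    target = "the cat sat on the mat".split()
    output, steps = decode_both_sides(make_oracle_step(target), max_steps=50)
    print(output, steps)
```

With the oracle above, a six-token target is reconstructed in three steps rather than six. How a real model decides where the two halves meet, and how the two directions attend to each other, is not covered by this sketch.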

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification

Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Residual Connection · Adam · Dropout · Multi-Head Attention · Byte Pair Encoding · Dense Connections