Sequence Generation: From Both Sides to the Middle
Long Zhou, Jiajun Zhang, Chengqing Zong, Heng Yu

TL;DR
This paper introduces a bidirectional sequence generation model that predicts outputs from both ends towards the middle, significantly speeding up decoding and enhancing quality in tasks like translation and summarization.
Contribution
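The paper proposes a novel synchronous bidirectional sequence generation (SBSG) model that decodes from both ends simultaneously with interactive bidirectional attention, addressing two weaknesses of left-to-right generation: slow decoding on long outputs and the lack of future-context guidance.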
Findings
Speeds up decoding compared to autoregressive models.
Improves generation quality in translation and summarization.
Effective bidirectional interaction enhances output coherence.
Abstract
The encoder-decoder framework has achieved promising progress on many sequence generation tasks, such as neural machine translation and text summarization. Such a framework usually generates a sequence token by token from left to right; hence (1) this autoregressive decoding procedure is time-consuming when the output sentence becomes longer, and (2) it lacks the guidance of future context, which is crucial to avoid under-translation. To alleviate these issues, we propose a synchronous bidirectional sequence generation (SBSG) model which predicts its outputs from both sides to the middle simultaneously. In the SBSG model, we enable the left-to-right (L2R) and right-to-left (R2L) generation to help and interact with each other by leveraging an interactive bidirectional attention network. Experiments on neural machine translation (En-De, Ch-En, and En-Ro) and text summarization tasks show…
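The decoding order described in the abstract can be illustrated with a small toy sketch. This is not the authors' implementation: the real model is a neural network with interactive bidirectional attention, whereas the `make_oracle_step` helper below is a hypothetical stand-in that simply reads tokens off a known target. The names `decode_both_sides` and `make_oracle_step`, and the rule that the L2R side takes the last middle token, are assumptions made for illustration only.

```python
from typing import Callable, List, Optional, Tuple

# A step function sees BOTH partial halves (a stand-in for the L2R/R2L
# interaction) and proposes one token for each direction.
Step = Callable[[List[str], List[str]], Tuple[Optional[str], Optional[str]]]


def decode_both_sides(step: Step, max_steps: int = 50) -> Tuple[List[str], int]:
    """Grow the output from both ends towards the middle, one pair per step."""
    left: List[str] = []   # L2R half, in reading order
    right: List[str] = []  # R2L half, in generation order (last token first)
    steps = 0
    for _ in range(max_steps):
        l_tok, r_tok = step(left, right)
        if l_tok is None and r_tok is None:
            break  # both directions have met in the middle
        if l_tok is not None:
            left.append(l_tok)
        if r_tok is not None:
            right.append(r_tok)
        steps += 1
    # The R2L half was produced back to front, so reverse it before joining.
    return left + right[::-1], steps


def make_oracle_step(target: List[str]) -> Step:
    """Hypothetical 'model' that already knows the target (illustration only)."""
    def step(left: List[str], right: List[str]) -> Tuple[Optional[str], Optional[str]]:
        remaining = len(target) - len(left) - len(right)
        if remaining <= 0:
            return None, None
        l_tok = target[len(left)]
        # Assumption: when a single middle token remains, the L2R side emits it.
        r_tok = target[len(target) - 1 - len(right)] if remaining >= 2 else None
        return l_tok, r_tok
    return step


if __name__ == "__main__":
    sentence = "the cat sat on the mat".split()
    tokens, steps = decode_both_sides(make_oracle_step(sentence))
    print(tokens, steps)
```

In this toy setting a sequence of n tokens is finished in about n/2 decoding steps, which is the intuition behind the speed-up; the speed and quality actually measured for SBSG depend on the model and are reported in the paper itself.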
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification
Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Residual Connection · Adam · Dropout · Multi-Head Attention · Byte Pair Encoding · Dense Connections
