Sequence-to-Sequence Learning as Beam-Search Optimization
Sam Wiseman, Alexander M. Rush

TL;DR
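This paper introduces a beam-search training scheme for seq2seq models that learns global sequence scores rather than only local next-word distributions. By matching the training loss to test-time beam search, it outperforms a highly optimized attention-based seq2seq baseline on three sequence-to-sequence tasks.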
Contribution
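It extends seq2seq to learn global sequence scores through a beam-search training scheme based on the work of Daume III and Marcu (2005). This avoids the classical biases associated with local training while preserving the seq2seq architecture and its efficient training approach.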
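The difference between local and global scoring can be illustrated with a toy example. The following is a minimal sketch, not the authors' implementation: the function names and the per-step score tables are hypothetical, and the tables stand in for network outputs that would normally depend on the prefix generated so far.

```python
import math
from typing import Dict, List


def log_softmax(scores: Dict[str, float]) -> Dict[str, float]:
    """Normalize raw scores for one time step into log-probabilities."""
    m = max(scores.values())
    log_z = m + math.log(sum(math.exp(s - m) for s in scores.values()))
    return {w: s - log_z for w, s in scores.items()}


def local_sequence_score(step_scores: List[Dict[str, float]], tokens: List[str]) -> float:
    """Locally normalized score: sum of per-step log-probabilities."""
    return sum(log_softmax(s)[w] for s, w in zip(step_scores, tokens))


def global_sequence_score(step_scores: List[Dict[str, float]], tokens: List[str]) -> float:
    """Unnormalized score: sum of raw per-step scores, with no softmax."""
    return sum(s[w] for s, w in zip(step_scores, tokens))


# Hypothetical raw scores for a two-step sequence over the vocabulary {a, b}.
toy_scores = [{"a": 2.0, "b": 0.0}, {"a": 5.0, "b": 5.0}]
toy_tokens = ["a", "b"]

local = local_sequence_score(toy_scores, toy_tokens)
global_ = global_sequence_score(toy_scores, toy_tokens)
print(local, global_)
```

In the second step both raw scores are equally high. Local normalization keeps only their relative size, while the unnormalized sum keeps their magnitude, which is what allows whole sequences to be compared directly.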
Findings
Outperforms a highly optimized attention-based seq2seq system and other baselines on three sequence-to-sequence tasks
Learns global sequence scores instead of only local next-word distributions
Unifies training loss with test-time usage
Abstract
Sequence-to-Sequence (seq2seq) modeling has rapidly become an important general-purpose NLP tool that has proven effective for many text-generation and sequence-labeling tasks. Seq2seq builds on deep neural language modeling and inherits its remarkable accuracy in estimating local, next-word distributions. In this work, we introduce a model and beam-search training scheme, based on the work of Daume III and Marcu (2005), that extends seq2seq to learn global sequence scores. This structured approach avoids classical biases associated with local training and unifies the training loss with the test-time usage, while preserving the proven model architecture of seq2seq and its efficient training approach. We show that our system outperforms a highly-optimized attention-based seq2seq system and other baselines on three different sequence to sequence tasks: word ordering, parsing, and machine…
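To make the idea of training through beam search concrete, here is a minimal sketch of a margin-based beam loss. It is an illustration under stated assumptions, not the paper's exact procedure: `make_bigram_scorer` is a hypothetical stand-in for a neural scorer, the margin is fixed at 1, no task-specific cost term is used, and the gold prefix is compared against the K-th best competing prefix. When the margin is violated, the beam is reset to the gold prefix before search continues.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Prefix = Tuple[str, ...]
StepScorer = Callable[[Prefix, str], float]


def make_bigram_scorer(table: Dict[Tuple[str, str], float]) -> StepScorer:
    """Hypothetical scorer: raw score of a token given only the previous token."""
    def step_score(prefix: Prefix, token: str) -> float:
        prev = prefix[-1] if prefix else "<s>"
        return table.get((prev, token), 0.0)
    return step_score


def beam_violation_loss(
    gold: Sequence[str],
    vocab: Sequence[str],
    step_score: StepScorer,
    beam_size: int,
) -> float:
    """Sum, over time steps, of margin violations between gold and beam prefixes."""
    beam: List[Tuple[float, Prefix]] = [(0.0, ())]
    gold_score = 0.0
    loss = 0.0
    for t, gold_tok in enumerate(gold):
        gold_prefix = tuple(gold[:t])
        gold_score += step_score(gold_prefix, gold_tok)
        gold_next = gold_prefix + (gold_tok,)

        # Expand every prefix on the beam by every vocabulary item.
        candidates = [
            (s + step_score(p, w), p + (w,)) for s, p in beam for w in vocab
        ]
        candidates.sort(key=lambda c: c[0], reverse=True)

        # Competing prefixes are all candidates other than the gold prefix.
        rivals = [c for c in candidates if c[1] != gold_next]
        if not rivals:
            beam = candidates[:beam_size]
            continue
        kth_rival_score = rivals[min(beam_size, len(rivals)) - 1][0]

        # Assumed margin of 1: gold must beat the K-th rival by at least 1.
        violation = 1.0 - gold_score + kth_rival_score
        if violation > 0.0:
            loss += violation
            beam = [(gold_score, gold_next)]  # reset search to the gold prefix
        else:
            beam = candidates[:beam_size]
    return loss


# Usage with hypothetical scores: one scorer favors the gold sequence, one is flat.
strong = make_bigram_scorer({("<s>", "a"): 3.0, ("a", "b"): 3.0})
flat = make_bigram_scorer({})
print(beam_violation_loss(["a", "b"], ["a", "b"], strong, beam_size=1))
print(beam_violation_loss(["a", "b"], ["a", "b"], flat, beam_size=1))
```

Because the loss is computed on the same beam that would be used at test time, a model trained this way is penalized exactly when search would lose the gold sequence. In a real system the violations would be backpropagated through the scorer; that step is omitted here.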
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory · Sequence to Sequence
