EDITOR: an Edit-Based Transformer with Repositioning for Neural Machine Translation with Soft Lexical Constraints
Weijia Xu, Marine Carpuat

TL;DR
EDITOR is an edit-based transformer for neural machine translation that adds a reposition operation to incorporate soft lexical constraints. It uses those constraints more effectively than the Levenshtein Transformer and decodes much faster than constrained beam search.
Contribution
Introduces a repositioning operation in an edit-based transformer, enabling flexible, efficient, and constraint-aware sequence generation for neural machine translation.
Findings
Uses soft lexical constraints more effectively than the Levenshtein Transformer
Decodes much faster than constrained beam search
Achieves comparable or better translation quality than the Levenshtein Transformer, with faster decoding, on multiple language pairs
Abstract
We introduce an Edit-Based Transformer with Repositioning (EDITOR), which makes sequence generation flexible by seamlessly allowing users to specify preferences in output lexical choice. Building on recent models for non-autoregressive sequence generation (Gu et al., 2019), EDITOR generates new sequences by iteratively editing hypotheses. It relies on a novel reposition operation designed to disentangle lexical choice from word positioning decisions, while enabling efficient oracles for imitation learning and parallel edits at decoding time. Empirically, EDITOR uses soft lexical constraints more effectively than the Levenshtein Transformer (Gu et al., 2019) while speeding up decoding dramatically compared to constrained beam search (Post and Vilar, 2018). EDITOR also achieves comparable or better translation quality with faster decoding speed than the Levenshtein Transformer on standard…
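The edit loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function names `reposition` and `insert`, the 1-based index convention with 0 meaning "drop", and the hand-written edit decisions are all assumptions made for illustration. In the actual model these decisions would be predicted by the transformer, in parallel across positions. The sketch only shows how a reposition step can reorder or drop tokens of a hypothesis seeded with user constraints, and how an insertion step then fills in new words.

```python
from typing import List, Sequence


def reposition(tokens: Sequence[str], indices: Sequence[int]) -> List[str]:
    """Toy reposition step (assumed convention, not taken from the paper's code).

    Output slot i receives input token number indices[i] (1-based).
    An index of 0 leaves the slot empty, which drops it from the output.
    """
    out: List[str] = []
    for r in indices:
        if r < 0 or r > len(tokens):
            raise ValueError(f"index {r} out of range for {len(tokens)} tokens")
        if r > 0:
            out.append(tokens[r - 1])
    return out


def insert(tokens: Sequence[str], gap_counts: Sequence[int],
           new_tokens: Sequence[str]) -> List[str]:
    """Toy insertion step (hypothetical helper).

    gap_counts[i] new tokens go in the gap before tokens[i];
    the final entry is the gap after the last token.
    """
    if len(gap_counts) != len(tokens) + 1:
        raise ValueError("need one gap count per gap")
    if sum(gap_counts) != len(new_tokens):
        raise ValueError("gap counts must match the number of new tokens")
    fill = iter(new_tokens)
    out: List[str] = []
    for i, count in enumerate(gap_counts):
        out.extend(next(fill) for _ in range(count))
        if i < len(tokens):
            out.append(tokens[i])
    return out


# Seed the hypothesis with user-supplied constraint words (order not fixed).
constraints = ["house", "red"]
# Hand-written edit decisions standing in for model predictions.
reordered = reposition(constraints, [2, 1])
final = insert(reordered, [1, 0, 0], ["the"])
# Because constraints are soft, a reposition step may also drop one.
dropped = reposition(constraints, [1, 0])
```

Keeping the two steps separate mirrors the idea stated in the abstract: word positioning is decided by the reposition step, while lexical choice is left to insertion.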
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Byte Pair Encoding · Dropout · Softmax · Residual Connection · Multi-Head Attention · Dense Connections · Label Smoothing
