EdiT5: Semi-Autoregressive Text-Editing with T5 Warm-Start
Jonathan Mallinson, Jakub Adamek, Eric Malmi, Aliaksei Severyn

TL;DR
EdiT5 is a semi-autoregressive text-editing model that combines non-autoregressive tagging and reordering with autoregressive insertion, achieving faster inference and competitive performance on various NLG tasks.
Contribution
The paper introduces a semi-autoregressive approach that decomposes text editing into tagging, reordering, and insertion, making inference faster than traditional seq2seq models while still supporting flexible input-output transformations.
Findings
Up to 25x faster inference than seq2seq models.
Comparable performance to T5 in high-resource settings.
Outperforms T5 in low-resource scenarios.
Abstract
We present EdiT5 - a novel semi-autoregressive text-editing model designed to combine the strengths of non-autoregressive text-editing and autoregressive decoding. EdiT5 is faster during inference than conventional sequence-to-sequence (seq2seq) models, while being capable of modelling flexible input-output transformations. This is achieved by decomposing the generation process into three sub-tasks: (1) tagging to decide on the subset of input tokens to be preserved in the output, (2) re-ordering to define their order in the output text, and (3) insertion to infill the missing tokens that are not present in the input. The tagging and re-ordering steps, which are responsible for generating the largest portion of the output, are non-autoregressive, while the insertion step uses an autoregressive decoder. Depending on the task, EdiT5 on average requires significantly fewer…
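The three sub-tasks named in the abstract can be illustrated with a toy sketch. This is not the EdiT5 model: the edit decisions below are written by hand, whereas in the paper they are predicted by neural components. The function name `apply_edits` and its `keep`, `order`, and `insertions` arguments are hypothetical names chosen for this illustration only.

```python
from typing import Dict, List, Sequence


def apply_edits(
    source: Sequence[str],
    keep: Sequence[bool],
    order: Sequence[int],
    insertions: Dict[int, List[str]],
) -> List[str]:
    """Build an output sequence from hand-written edit decisions.

    keep:       one flag per source token (tagging step).
    order:      indices of the kept source tokens, in output order
                (re-ordering step).
    insertions: maps slot i to new tokens placed before the i-th kept
                token in output order; slot len(order) appends at the
                end (insertion step).
    """
    if len(keep) != len(source):
        raise ValueError("keep must have one entry per source token")
    kept = [i for i, flag in enumerate(keep) if flag]
    if sorted(order) != kept:
        raise ValueError("order must contain each kept index exactly once")

    output: List[str] = []
    for slot, src_idx in enumerate(order):
        output.extend(insertions.get(slot, []))  # tokens absent from the input
        output.append(source[src_idx])           # token copied from the input
    output.extend(insertions.get(len(order), []))
    return output


# Toy example: drop "well", move "he left" to the front, insert "very".
source = ["well", "long", "ago", "he", "left", "."]
keep = [False, True, True, True, True, True]
order = [3, 4, 1, 2, 5]
insertions = {2: ["very"]}

print(" ".join(apply_edits(source, keep, order, insertions)))
# he left very long ago .
```

In this toy case five of the six output tokens are copied from the input through the tagging and re-ordering decisions, and only one token ("very") would need to be produced by an autoregressive decoder. That is the intuition behind the abstract's claim that the non-autoregressive steps generate the largest portion of the output.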
Taxonomy
Topics: Natural Language Processing Techniques · Software Engineering Research · Topic Modeling
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Byte Pair Encoding · Dense Connections · Dropout · Inverse Square Root Schedule · SentencePiece · Adafactor · Sigmoid Activation
