DeltaLM: Encoder-Decoder Pre-training for Language Generation and Translation by Augmenting Pretrained Multilingual Encoders
Shuming Ma, Li Dong, Shaohan Huang, Dongdong Zhang, Alexandre Muzio, Saksham Singhal, Hany Hassan Awadalla, Xia Song, Furu Wei

TL;DR
DeltaLM is a novel pretrained multilingual encoder-decoder model that bridges the gap between language understanding and generation tasks by augmenting existing encoders with a decoder and employing self-supervised pre-training.
Contribution
It introduces a unified encoder-decoder pretraining framework that leverages both monolingual and bilingual data for improved language generation and translation performance.
Findings
Outperforms strong baselines on translation and generation tasks
Effective use of span corruption and translation span corruption pretraining tasks
Achieves state-of-the-art results in multiple NLP benchmarks
Abstract
While pretrained encoders have achieved success in various natural language understanding (NLU) tasks, there is a gap between these pretrained encoders and natural language generation (NLG). NLG tasks are often based on the encoder-decoder framework, where the pretrained encoders can only benefit part of it. To reduce this gap, we introduce DeltaLM, a pretrained multilingual encoder-decoder model that regards the decoder as the task layer of off-the-shelf pretrained encoders. Specifically, we augment the pretrained multilingual encoder with a decoder and pre-train it in a self-supervised way. To take advantage of both the large-scale monolingual data and bilingual data, we adopt the span corruption and translation span corruption as the pre-training tasks. Experiments show that DeltaLM outperforms various strong baselines on both natural language generation and translation tasks,…
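The two pre-training tasks named in the abstract can be illustrated with a small sketch. It assumes T5-style span corruption, where each masked span in the input is replaced by a sentinel token and the target lists each sentinel followed by the tokens it hides, and it assumes translation span corruption applies the same procedure to a concatenated sentence pair. The function names, the `<mask_i>` sentinel format, the `</s>` separator, and the explicit span positions are illustrative assumptions, not details taken from the paper; a real pipeline would sample spans randomly over subword tokens.

```python
from typing import List, Sequence, Tuple


def span_corrupt(
    tokens: Sequence[str], spans: Sequence[Tuple[int, int]]
) -> Tuple[List[str], List[str]]:
    """Replace each [start, end) span with a sentinel token.

    Returns (corrupted_input, target). The target lists every sentinel
    followed by the tokens it replaced. Sentinel names are hypothetical.
    """
    corrupted: List[str] = []
    target: List[str] = []
    cursor = 0
    for i, (start, end) in enumerate(sorted(spans)):
        if start < cursor or end <= start or end > len(tokens):
            raise ValueError("spans must be non-empty, in range, and non-overlapping")
        sentinel = f"<mask_{i}>"
        corrupted.extend(tokens[cursor:start])  # keep the text before the span
        corrupted.append(sentinel)              # hide the span in the input
        target.append(sentinel)                 # the decoder must recover it
        target.extend(tokens[start:end])
        cursor = end
    corrupted.extend(tokens[cursor:])           # keep the text after the last span
    return corrupted, target


def translation_span_corrupt(
    source: Sequence[str],
    translation: Sequence[str],
    spans: Sequence[Tuple[int, int]],
    sep: str = "</s>",
) -> Tuple[List[str], List[str]]:
    """Concatenate a sentence pair, then corrupt spans over the joint sequence.

    Span indices refer to positions in source + [sep] + translation.
    """
    return span_corrupt(list(source) + [sep] + list(translation), spans)


if __name__ == "__main__":
    mono = "the cat sat on the mat".split()
    print(span_corrupt(mono, [(1, 3), (5, 6)]))
    print(translation_span_corrupt(["hello", "world"], ["hallo", "welt"], [(1, 2), (3, 4)]))
```

Because the spans in the bilingual case can fall on either side of the separator, the model can draw on the other language to fill them in, which is how this kind of objective makes use of parallel data.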
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification
