E2S2: Encoding-Enhanced Sequence-to-Sequence Pretraining for Language Understanding and Generation

Qihuang Zhong; Liang Ding; Juhua Liu; Bo Du; Dacheng Tao

arXiv:2205.14912 · cs.CL · January 10, 2024

TL;DR

E2S2 introduces an encoding-focused self-supervised pretraining strategy for seq2seq models, enhancing their language understanding and generation capabilities by improving encoder representations.

Contribution

The paper proposes a novel encoding-enhanced pretraining method, E2S2, which integrates denoising and contrastive objectives into the encoder to improve seq2seq model performance.
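
As a rough illustration of how an encoder-side contrastive term could be added to a seq2seq training loss, here is a minimal sketch in plain Python. It is not the paper's formulation: the InfoNCE-style loss, cosine similarity, the temperature value, the simple weighted sum, and all names (`info_nce`, `combined_loss`, `w_denoise`, `w_contrast`) are assumptions made for this example.

```python
import math


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss for one anchor vector (assumed form, not from the paper).

    All vectors are plain lists of floats and are assumed to be non-zero.
    """
    def cosine(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
        return dot / norm

    # The positive pair sits at index 0, followed by the negatives.
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]

    # Numerically stable log-sum-exp over all candidates.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))

    # Negative log-probability of picking the positive.
    return log_sum - logits[0]


def combined_loss(seq2seq_loss, denoise_loss, contrastive_loss,
                  w_denoise=1.0, w_contrast=1.0):
    """Hypothetical weighted sum of decoder-side and encoder-side losses."""
    return seq2seq_loss + w_denoise * denoise_loss + w_contrast * contrastive_loss


if __name__ == "__main__":
    # A positive that matches the anchor yields a lower loss than a mismatched one.
    aligned = info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
    misaligned = info_nce([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
    print(aligned < misaligned)
    print(combined_loss(2.0, 0.5, 0.25))
```

In a real training loop the vectors would be pooled encoder outputs and the denoising term would come from reconstructing corrupted input tokens; both are left as scalars here to keep the sketch self-contained.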

Findings

01 · Achieves a 1.1% gain on the GLUE benchmark

02 · Improves the F0.5 score by 1.75% on CoNLL2014

03 · Enhances linguistic representations in seq2seq models

Abstract

Sequence-to-sequence (seq2seq) learning is a popular paradigm for large-scale pretraining of language models. However, prior seq2seq pretraining models generally focus on reconstructive objectives on the decoder side and neglect the effect of encoder-side supervision, which we argue may lead to sub-optimal performance. To verify our hypothesis, we first empirically study the functionalities of the encoder and decoder in seq2seq pretrained language models, and find that the encoder plays an important but under-exploited role compared with the decoder in terms of downstream performance and neuron activation. Therefore, we propose an encoding-enhanced seq2seq pretraining strategy, namely E2S2, which improves seq2seq models by integrating more efficient self-supervised information into the encoders. Specifically, E2S2 adopts two self-supervised objectives on the encoder side from two…

Code & Models

Repositories

whu-zqh/e2s2
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification

Methods: Multi-Head Attention · Attention Is All You Need · Inverse Square Root Schedule · Gated Linear Unit · Adafactor · Attention Dropout · SentencePiece · T5 · Linear Layer