Syntax-Enhanced Pre-trained Model

Zenan Xu; Daya Guo; Duyu Tang; Qinliang Su; Linjun Shou; Ming Gong; Wanjun Zhong; Xiaojun Quan; Nan Duan; Daxin Jiang

arXiv:2012.14116 · cs.CL · June 1, 2021 · 1 citation

TL;DR

This paper introduces a syntax-aware Transformer model that incorporates syntactic dependency information during both pre-training and fine-tuning, leading to improved performance on various NLP tasks without relying on human-annotated syntax.

Contribution

The paper proposes a novel syntax-aware attention mechanism and a pre-training task for syntactic distance prediction, enabling the use of automatic syntax in pre-trained models.
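The syntactic distance that the pre-training task predicts can be illustrated with a small sketch. It assumes the distance between two tokens is the number of edges on the path connecting them in the dependency tree; the function name `syntactic_distances`, the head-index encoding, and the example parse are hypothetical and are not taken from the paper or its repository.

```python
from collections import deque


def syntactic_distances(heads):
    """Return pairwise path lengths between tokens in a dependency tree.

    heads[i] is the index of token i's head, or -1 for the root.
    (This encoding is an assumption made for the sketch.)
    """
    n = len(heads)
    # Treat the tree as undirected: link each token to its head.
    adj = [[] for _ in range(n)]
    for child, head in enumerate(heads):
        if head >= 0:
            adj[child].append(head)
            adj[head].append(child)

    # Breadth-first search from every token gives all pairwise distances.
    dist = [[-1] * n for _ in range(n)]
    for start in range(n):
        dist[start][start] = 0
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in adj[node]:
                if dist[start][nxt] < 0:
                    dist[start][nxt] = dist[start][node] + 1
                    queue.append(nxt)
    return dist


# Hypothetical parse of "She reads old books": "reads" is the root,
# "She" and "books" attach to "reads", and "old" attaches to "books".
tokens = ["She", "reads", "old", "books"]
heads = [1, -1, 3, 1]
distances = syntactic_distances(heads)
```

In this toy parse, "She" and "old" are three edges apart even though neither is the head of the other, which is the kind of global relation a head-only signal would not expose.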

Findings

1. Automatic syntactic information improves model performance.
2. Global syntactic distances outperform local head relations.
3. The model achieves state-of-the-art results on multiple NLP benchmarks.

Abstract

We study the problem of leveraging the syntactic structure of text to enhance pre-trained models such as BERT and RoBERTa. Existing methods utilize syntax of text either in the pre-training stage or in the fine-tuning stage, so that they suffer from discrepancy between the two stages. Such a problem would lead to the necessity of having human-annotated syntactic information, which limits the application of existing methods to broader scenarios. To address this, we present a model that utilizes the syntax of text in both pre-training and fine-tuning stages. Our model is based on Transformer with a syntax-aware attention layer that considers the dependency tree of the text. We further introduce a new pre-training task of predicting the syntactic distance among tokens in the dependency tree. We evaluate the model on three downstream tasks, including relation classification, entity typing,…
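One way to picture an attention layer that "considers the dependency tree" is to let tree distance lower the attention score between distant tokens. The sketch below is only an illustration of that general idea: the additive penalty, the `penalty` parameter, the function names, and the hard-coded distance matrix are all assumptions, and the paper's actual formulation may differ.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def distance_biased_attention(scores, dist, penalty=1.0):
    """Subtract penalty * tree distance from each raw score, then normalize rows.

    This additive bias is an assumption for illustration, not the paper's method.
    """
    return [
        softmax([s - penalty * d for s, d in zip(score_row, dist_row)])
        for score_row, dist_row in zip(scores, dist)
    ]


# Hypothetical tree distances for a four-token sentence.
dist = [
    [0, 1, 3, 2],
    [1, 0, 2, 1],
    [3, 2, 0, 1],
    [2, 1, 1, 0],
]
# Uniform raw scores, so any difference in the weights comes from the tree.
scores = [[0.0] * 4 for _ in range(4)]
weights = distance_biased_attention(scores, dist)
```

With uniform raw scores, each token attends most to itself and progressively less to tokens that are further away in the tree.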

Code & Models

Repositories

Hi-ZenanXu/Syntax-Enhanced_Pre-trained_Model (official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications

Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Attention Is All You Need · Dropout · Adam · Multi-Head Attention · WordPiece · Residual Connection · Byte Pair Encoding