Semantics-aware BERT for Language Understanding

Zhuosheng Zhang; Yuwei Wu; Hai Zhao; Zuchao Li; Shuailiang Zhang; Xi Zhou; Xiang Zhou

arXiv:1909.02209·cs.CL·February 5, 2020·25 cites

TL;DR

This paper introduces Semantics-aware BERT (SemBERT), which enhances language understanding by integrating explicit semantic information from semantic role labeling into the BERT model, leading to improved performance on multiple NLP tasks.

Contribution

The paper proposes a novel Semantics-aware BERT that incorporates structured semantic information, significantly improving upon standard BERT without complex modifications.

Findings

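01. Obtains new state-of-the-art or substantially improved results on ten reading comprehension and language inference tasks.
02. Substantially improves performance over BERT.
03. Keeps BERT's ease of use: light fine-tuning, without substantial task-specific modifications.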

Abstract

The latest work on language representations carefully integrates contextualized features into language model training, which enables a series of successes, especially in various machine reading comprehension and natural language inference tasks. However, the existing language representation models, including ELMo, GPT and BERT, only exploit plain context-sensitive features such as character or word embeddings. They rarely consider incorporating structured semantic information, which can provide rich semantics for language representation. To promote natural language understanding, we propose to incorporate explicit contextual semantics from pre-trained semantic role labeling, and introduce an improved language representation model, Semantics-aware BERT (SemBERT), which is capable of explicitly absorbing contextual semantics over a BERT backbone. SemBERT keeps the convenient usability of its…
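
As a rough illustration of the idea described in the abstract, the sketch below joins a per-token contextual vector with an embedding of that token's semantic role label. It is not the authors' implementation: the toy vectors, the label table `LABEL_EMBEDDINGS`, and the function `fuse_with_srl` are hypothetical names chosen for this example, and plain concatenation is an assumed, simplified fusion step. The official cooelf/SemBERT repository is the reference for the real architecture.

```python
from typing import Dict, List

# Hypothetical toy embeddings for a few semantic role labels.
# In a trained model these vectors would be learned, not hand-written.
LABEL_EMBEDDINGS: Dict[str, List[float]] = {
    "ARG0": [1.0, 0.0],  # agent
    "V": [0.0, 1.0],     # predicate (verb)
    "ARG1": [1.0, 1.0],  # patient
    "O": [0.0, 0.0],     # token outside any role
}


def fuse_with_srl(
    token_vectors: List[List[float]],
    srl_labels: List[str],
    label_embeddings: Dict[str, List[float]] = LABEL_EMBEDDINGS,
) -> List[List[float]]:
    """Concatenate each token vector with the embedding of its role label.

    Assumes one label per token and a single predicate-argument structure.
    """
    if len(token_vectors) != len(srl_labels):
        raise ValueError("need exactly one semantic role label per token")
    # List concatenation builds a new, longer vector for every token.
    return [
        vector + label_embeddings[label]
        for vector, label in zip(token_vectors, srl_labels)
    ]


# Stand-in contextual vectors for the sentence "cats chase mice".
contextual = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6], [0.7, 0.8, 0.9]]
labels = ["ARG0", "V", "ARG1"]

fused = fuse_with_srl(contextual, labels)
for vector in fused:
    print(vector)
```

Each fused vector has five dimensions here: three from the stand-in contextual vector and two from the label embedding. A downstream classifier would consume the fused vectors in place of the plain contextual ones.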

Code & Models

Repositories

cooelf/SemBERT (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications

Methods: Linear Layer · Cosine Annealing · Sigmoid Activation · Tanh Activation · Long Short-Term Memory · Bidirectional LSTM · ELMo · Residual Connection · Attention Dropout · Linear Warmup With Linear Decay