Roof-Transformer: Divided and Joined Understanding with Knowledge Enhancement

Wei-Lin Liao; Cheng-En Su; Wei-Yun Ma

arXiv:2112.06736 · cs.CL · October 21, 2022

TL;DR

Roof-Transformer encodes knowledge resources and the original input with two separate BERT encoders joined by a fusion layer, so that long inputs no longer crowd out external knowledge. It reports improved performance on QA tasks and the GLUE benchmark.

Contribution

Introduces a dual BERT architecture with a fusion layer to better incorporate knowledge resources in long-text NLP tasks.

Findings

1. Improved accuracy on QA tasks.

2. Enhanced performance on GLUE benchmark.

3. Effective knowledge integration in long documents.

Abstract

Recent work on enhancing BERT-based language representation models with knowledge graphs (KGs) and knowledge bases (KBs) has yielded promising results on multiple NLP tasks. State-of-the-art approaches typically integrate the original input sentences with KG triples and feed the combined representation into a BERT model. However, as the sequence length of a BERT model is limited, such a framework supports little knowledge other than the original input sentences and is thus forced to discard some knowledge. This problem is especially severe for downstream tasks for which the input is a long paragraph or even a document, such as QA or reading comprehension tasks. We address this problem with Roof-Transformer, a model with two underlying BERTs and a fusion layer on top. One underlying BERT encodes the knowledge resources and the other one encodes the original input sentences, and the…
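The sequence-length problem described in the abstract can be illustrated with a small token-budget calculation. This is a minimal sketch, not the authors' implementation: the 512-token limit is the usual maximum for standard BERT checkpoints and is assumed here, each triple element is assumed to cost exactly one token, and the names `fit_triples`, `single_encoder`, and `separate_knowledge_encoder` are hypothetical helpers introduced only for illustration.

```python
from typing import List, Sequence, Tuple

Triple = Tuple[str, str, str]

# Assumption: 512 is the maximum sequence length of standard BERT checkpoints.
MAX_LEN = 512


def fit_triples(
    triples: Sequence[Triple], used: int, max_len: int = MAX_LEN
) -> Tuple[List[Triple], List[Triple]]:
    """Append triples in order until the budget is exhausted.

    Returns (kept, dropped). Each triple is assumed to cost one token per
    element plus one separator token (a simplification for illustration).
    """
    kept: List[Triple] = []
    for i, triple in enumerate(triples):
        cost = len(triple) + 1
        if used + cost > max_len:
            # Everything from this triple onward is truncated.
            return kept, list(triples[i:])
        kept.append(triple)
        used += cost
    return kept, []


def single_encoder(input_tokens: Sequence[str], triples: Sequence[Triple]):
    """Input and knowledge share one sequence: [CLS] input [SEP] triples..."""
    return fit_triples(triples, used=len(input_tokens) + 2)


def separate_knowledge_encoder(triples: Sequence[Triple]):
    """Knowledge gets its own sequence, so only [CLS] is pre-spent."""
    return fit_triples(triples, used=1)


# A long input (480 tokens) with 20 candidate KG triples.
input_tokens = ["tok"] * 480
triples = [("head%d" % i, "rel", "tail%d" % i) for i in range(20)]

kept_single, dropped_single = single_encoder(input_tokens, triples)
kept_sep, dropped_sep = separate_knowledge_encoder(triples)

print(len(kept_single), len(dropped_single))  # 7 13
print(len(kept_sep), len(dropped_sep))        # 20 0
```

Under these assumptions, sharing one sequence forces 13 of the 20 triples to be discarded, while giving the knowledge its own encoder budget keeps all of them. How the two encoder outputs are then combined is the job of the fusion layer, which this sketch does not model.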

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Attention Dropout · Weight Decay · Linear Warmup With Linear Decay · Residual Connection · Dense Connections · Layer Normalization