A Sample-Based Training Method for Distantly Supervised Relation Extraction with Pre-Trained Transformers

Mehrdad Nasser; Mohamad Bagher Sajadi; Behrouz Minaei-Bidgoli

arXiv:2104.07512·cs.CL·April 16, 2021

Open Access

TL;DR

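This paper introduces a sampling method for distantly supervised relation extraction (DSRE) that reduces the hardware needed for training by randomly sampling sentences from the bags in each batch. An ensemble of trained models is used at prediction time to offset the sentences lost to sampling. The method is demonstrated by fine-tuning BERT on the NYT dataset.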

Contribution

The paper proposes a new sampling approach for DSRE that relaxes the hardware requirements of MIL training with large sentence encoders such as BERT, and uses an ensemble of trained models for prediction to alleviate the issues caused by random sampling.

Findings

01 Outperforms previous methods in AUC and P@N metrics

02 Reduces hardware requirements for MIL-based DSRE

03 Effective ensemble strategy improves relation extraction accuracy

Abstract

Multiple instance learning (MIL) has become the standard learning paradigm for distantly supervised relation extraction (DSRE). However, because relation extraction is performed at the bag level, MIL has significant hardware requirements for training when coupled with large sentence encoders such as deep transformer neural networks. In this paper, we propose a novel sampling method for DSRE that relaxes these hardware requirements. In the proposed method, we limit the number of sentences in a batch by randomly sampling sentences from the bags in the batch. However, this comes at the cost of losing valid sentences from bags. To alleviate the issues caused by random sampling, we use an ensemble of trained models for prediction. We demonstrate the effectiveness of our approach by using our proposed learning setting to fine-tune BERT on the widely used NYT dataset. Our approach significantly…
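
The two ideas in the abstract, capping the number of sentences per batch by random sampling and combining several trained models at prediction time, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names `sample_batch` and `ensemble_predict`, the rule that every bag keeps at least one sentence, and the use of simple probability averaging for the ensemble are all assumptions made for illustration; the abstract does not specify these details.

```python
import random
from typing import Dict, List, Sequence


def sample_batch(
    bags: Dict[str, List[str]], max_sentences: int, rng: random.Random
) -> Dict[str, List[str]]:
    """Randomly sample sentences so the batch holds at most max_sentences.

    Assumption (not from the paper): every bag is non-empty and keeps at
    least one sentence, so the budget must be >= the number of bags.
    """
    if max_sentences < len(bags):
        raise ValueError("budget must allow at least one sentence per bag")

    sampled: Dict[str, List[str]] = {}
    leftovers = []  # (bag_id, sentence) pairs not yet selected
    for bag_id, sentences in bags.items():
        keep = rng.randrange(len(sentences))
        sampled[bag_id] = [sentences[keep]]
        leftovers.extend(
            (bag_id, s) for i, s in enumerate(sentences) if i != keep
        )

    # Spend the remaining budget uniformly at random over what is left.
    budget = max_sentences - len(bags)
    for bag_id, sentence in rng.sample(leftovers, min(budget, len(leftovers))):
        sampled[bag_id].append(sentence)
    return sampled


def ensemble_predict(per_model_probs: Sequence[Sequence[float]]) -> List[float]:
    """Average per-relation probabilities from several models for one bag.

    Assumption: plain averaging; the paper may combine models differently.
    """
    n_models = len(per_model_probs)
    return [sum(column) / n_models for column in zip(*per_model_probs)]


if __name__ == "__main__":
    # Hypothetical toy bags: entity-pair id -> sentences mentioning the pair.
    toy_bags = {"a": ["s1", "s2", "s3"], "b": ["t1"], "c": ["u1", "u2"]}
    print(sample_batch(toy_bags, max_sentences=4, rng=random.Random(0)))
    print(ensemble_predict([[0.2, 0.8], [0.4, 0.6]]))
```

Because each training run sees a different random subset of every bag, the models in such an ensemble are trained on different views of the data, which is one plausible reading of why combining them could recover information lost to sampling.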

Taxonomy

Topics: Topic Modeling · Text and Document Classification Technologies · Advanced Text Analysis Techniques

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Dense Connections · Softmax · Linear Warmup With Linear Decay · Weight Decay · WordPiece · Dropout