Bootstrapping Relation Extractors using Syntactic Search by Examples

Matan Eyal; Asaf Amrami; Hillel Taub-Tabib; Yoav Goldberg

arXiv:2102.05007·cs.CL·February 10, 2021

TL;DR

This paper introduces a quick method for bootstrapping relation extraction training datasets with search engines over syntactic graphs, so that non-NLP-experts can train competitive models without extensive manual annotation.

Contribution

The paper presents a novel approach combining syntactic search with NLG data augmentation to efficiently generate training data for relation extractors, reducing reliance on manual labeling.

Findings

01

Models trained on data obtained through syntactic search are competitive with models trained on manually annotated data and on data obtained from distant supervision.

02

The combined approach outperforms NLG data augmentation alone.

03

The method enables non-experts to quickly create effective relation extraction datasets.

Abstract

The advent of neural networks in NLP brought with it substantial improvements in supervised relation extraction. However, obtaining a sufficient quantity of training data remains a key challenge. In this work we propose a process for bootstrapping training datasets which can be performed quickly by non-NLP-experts. We take advantage of search engines over syntactic graphs (such as Shlain et al. (2020)), which expose a friendly by-example syntax. We use these to obtain positive examples by searching for sentences that are syntactically similar to user input examples. We apply this technique to relations from TACRED and DocRED and show that the resulting models are competitive with models trained on manually annotated data and on data obtained from distant supervision. The models also outperform models trained using NLG data augmentation techniques. Extending the search-based approach with…
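
To make the idea of searching by example concrete, here is a minimal Python sketch. It is not the authors' code and does not reproduce the query syntax of the engine by Shlain et al. (2020). Every name in it (pattern_from_example, match, bootstrap), the toy sentences, and the hand-written dependency edges are assumptions for illustration only; a real pipeline would obtain the edges from a dependency parser and query an indexed corpus. The sketch turns one user-marked example into a syntactic pattern by replacing its two arguments with slots, then collects corpus sentences whose dependency edges satisfy that pattern as positive examples.

```python
# Toy illustration of "search by example" over dependency edges.
# All names and data are hypothetical; edges are hand-written, not parsed.

def pattern_from_example(edges, subj, obj):
    """Replace the two marked arguments with slots and keep the edges that touch them."""
    slot = {subj: "$SUBJ", obj: "$OBJ"}
    return {
        (slot.get(head, head), label, slot.get(child, child))
        for head, label, child in edges
        if head in slot or child in slot
    }

def match(pattern, edges):
    """Return every (subj, obj) binding under which all pattern edges occur in `edges`."""
    edge_set = set(edges)
    words = sorted({w for head, _, child in edges for w in (head, child)})
    found = []
    for s in words:
        for o in words:
            if s == o:
                continue
            bind = {"$SUBJ": s, "$OBJ": o}
            if all((bind.get(h, h), l, bind.get(c, c)) in edge_set
                   for h, l, c in pattern):
                found.append((s, o))
    return found

def bootstrap(pattern, corpus):
    """Collect (sentence, subj, obj) positives from a corpus of (text, edges) pairs."""
    return [(text, s, o) for text, edges in corpus for s, o in match(pattern, edges)]

# One user-provided example with its arguments marked.
example_edges = [("founded", "nsubj", "Paul"), ("founded", "dobj", "Microsoft")]
pattern = pattern_from_example(example_edges, subj="Paul", obj="Microsoft")

# A tiny stand-in corpus with hand-written dependency edges.
corpus = [
    ("Jobs founded Apple in 1976",
     [("founded", "nsubj", "Jobs"), ("founded", "dobj", "Apple"),
      ("founded", "prep", "in"), ("in", "pobj", "1976")]),
    ("Apple hired Jobs",
     [("hired", "nsubj", "Apple"), ("hired", "dobj", "Jobs")]),
]

positives = bootstrap(pattern, corpus)
```

In this toy setup only the first sentence matches, because it shares the verb and the nsubj/dobj structure of the example. The extra prepositional phrase does not block the match, since the pattern constrains only the edges that touch the two arguments.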

Code & Models

Repositories

mataney/BootstrappingRelationExtractors (PyTorch, official)
