IELM: An Open Information Extraction Benchmark for Pre-Trained Language Models
Chenguang Wang, Xiao Liu, Dawn Song

TL;DR
This paper introduces IELM, a new benchmark for open information extraction using pre-trained language models, demonstrating that these models can perform zero-shot OIE with competitive results on multiple datasets.
Contribution
The paper creates a novel open information extraction benchmark for pre-trained LMs and shows they can perform zero-shot OIE effectively without training.
Findings
Pre-trained LMs achieve competitive zero-shot OIE performance.
They outperform supervised methods on factual OIE datasets.
The benchmark includes new large-scale datasets for evaluation.
Abstract
We introduce a new open information extraction (OIE) benchmark for pre-trained language models (LMs). Recent studies have demonstrated that pre-trained LMs, such as BERT and GPT, may store linguistic and relational knowledge. In particular, LMs are able to answer "fill-in-the-blank" questions when given a pre-defined relation category. Instead of focusing on pre-defined relations, we create an OIE benchmark aiming to fully examine the open relational information present in the pre-trained LMs. We accomplish this by turning pre-trained LMs into zero-shot OIE systems. Surprisingly, pre-trained LMs are able to obtain competitive performance on both standard OIE datasets (CaRB and Re-OIE2016) and two new large-scale factual OIE datasets (TAC KBP-OIE and Wikidata-OIE) that we establish via distant supervision. For instance, the zero-shot pre-trained LMs outperform the F1 score of the…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Layer Normalization · Cosine Annealing · Byte Pair Encoding · Residual Connection · Dropout · Dense Connections
