GenIE: Generative Information Extraction
Martin Josifoski, Nicola De Cao, Maxime Peyrard, Fabio Petroni, Robert West

TL;DR
GenIE introduces an end-to-end autoregressive model for closed information extraction that leverages pre-trained transformers, ensuring consistency with knowledge base schemas and scalability to large entity and relation sets.
Contribution
GenIE is the first to formulate closed information extraction as an autoregressive generative task, improving scalability, accuracy, and data efficiency over previous pipeline approaches.
Findings
State-of-the-art performance on closed information extraction
Generalizes well with less training data
Scales to large numbers of entities and relations
Abstract
Structured and grounded representation of text is typically formalized by closed information extraction, the problem of extracting an exhaustive set of (subject, relation, object) triplets that are consistent with a predefined set of entities and relations from a knowledge base schema. Most existing works are pipelines prone to error accumulation, and all approaches are only applicable to unrealistically small numbers of entities and relations. We introduce GenIE (generative information extraction), the first end-to-end autoregressive formulation of closed information extraction. GenIE naturally exploits the language knowledge from the pre-trained transformer by autoregressively generating relations and entities in textual form. Thanks to a new bi-level constrained generation strategy, only triplets consistent with the predefined knowledge base schema are produced. Our experiments show…
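To make the idea of schema-constrained generation concrete, here is a minimal sketch of one common way to restrict an autoregressive decoder to valid names: a prefix trie over tokenized entity and relation names, consulted at every step to decide which tokens may come next. This is an illustration under stated assumptions, not the authors' implementation: the helper names (`build_trie`, `allowed_next`, `constrained_greedy`, `decode_triplet`), the `<end>` marker, the word-level tokens, the greedy search, and the `score` callback standing in for the language model are all hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple

END = "<end>"  # hypothetical marker that closes a name

def build_trie(names: Sequence[Sequence[str]]) -> Dict:
    """Build a nested-dict prefix trie over tokenized names."""
    root: Dict = {}
    for tokens in names:
        node = root
        for tok in list(tokens) + [END]:
            node = node.setdefault(tok, {})
    return root

def allowed_next(trie: Dict, prefix: Sequence[str]) -> List[str]:
    """Return the tokens that keep `prefix` on a path to a valid name."""
    node = trie
    for tok in prefix:
        if tok not in node:
            return []  # prefix is not in the schema at all
        node = node[tok]
    return sorted(node.keys())

# score(slot, prefix, token) -> float; a stand-in for the model's token score.
Score = Callable[[str, List[str], str], float]

def constrained_greedy(trie: Dict, slot: str, score: Score) -> List[str]:
    """Greedily pick the best-scoring token among the allowed ones only."""
    prefix: List[str] = []
    while True:
        options = allowed_next(trie, prefix)
        if not options:  # only reachable for an empty trie
            return prefix
        best = max(options, key=lambda t: score(slot, prefix, t))
        if best == END:
            return prefix
        prefix.append(best)

def decode_triplet(entity_trie: Dict, relation_trie: Dict,
                   score: Score) -> Tuple[List[str], List[str], List[str]]:
    """Two levels of constraint: the slot selects the trie, the trie the tokens."""
    subject = constrained_greedy(entity_trie, "subject", score)
    relation = constrained_greedy(relation_trie, "relation", score)
    obj = constrained_greedy(entity_trie, "object", score)
    return subject, relation, obj
```

Because every step chooses only among tokens the trie allows, the decoded subject, relation, and object are always names from the predefined sets, whatever the scoring function prefers. A real system would use subword tokens and beam search rather than this greedy loop.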
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Advanced Text Analysis Techniques
Methods: Balanced Selection
