PIVOINE: Instruction Tuning for Open-world Information Extraction
Keming Lu, Xiaoman Pan, Kaiqiang Song, Hongming Zhang, Dong Yu, Jianshu Chen

TL;DR
PIVOINE is a BLOOM-based large language model, instruction-tuned on the new INSTRUCTOPENWIKI dataset, that performs open-world information extraction beyond predefined ontologies and outperforms traditional closed-world methods.
Contribution
The paper introduces PIVOINE, a novel LLM for open-world IE, trained on INSTRUCTOPENWIKI, demonstrating strong generalization and instruction-following capabilities.
Findings
PIVOINE outperforms traditional closed-world IE methods.
It generalizes well to unseen instructions and out-of-ontology cases.
INSTRUCTOPENWIKI, an instruction tuning dataset with a comprehensive corpus, extensive annotations, and diverse instructions, supports open-world IE training.
Abstract
We consider the problem of Open-world Information Extraction (Open-world IE), which extracts comprehensive entity profiles from unstructured texts. Different from the conventional closed-world setting of Information Extraction (IE), Open-world IE considers a more general situation where entities and relations could be beyond a predefined ontology. More importantly, we seek to develop a large language model (LLM) that is able to perform Open-world IE to extract desirable entity profiles characterized by (possibly fine-grained) natural language instructions. We achieve this by finetuning LLMs using instruction tuning. In particular, we construct INSTRUCTOPENWIKI, a substantial instruction tuning dataset for Open-world IE enriched with a comprehensive corpus, extensive annotations, and diverse instructions. We finetune the pretrained BLOOM models on INSTRUCTOPENWIKI and obtain PIVOINE, an…
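To make the setting in the abstract concrete, here is a minimal sketch of how an instruction, an input text, and an extracted entity profile could be represented. It is not the authors' data format: the `EntityProfile` fields, the prompt layout in `build_prompt`, and the `out_of_ontology_types` helper are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class EntityProfile:
    # Hypothetical fields: the abstract only says "comprehensive entity profiles".
    mention: str
    types: list[str] = field(default_factory=list)
    relations: list[tuple[str, str]] = field(default_factory=list)  # (relation, target)


def build_prompt(instruction: str, text: str) -> str:
    """Join a natural language instruction and the input text into one prompt.

    The layout is an assumption, not the template used for INSTRUCTOPENWIKI.
    """
    return f"Instruction: {instruction}\nText: {text}\nOutput:"


def out_of_ontology_types(profile: EntityProfile, ontology: set[str]) -> list[str]:
    """Return the types that a closed-world system limited to `ontology` could not emit."""
    return [t for t in profile.types if t not in ontology]


# Illustrative example only; not taken from the dataset.
profile = EntityProfile(
    mention="BLOOM",
    types=["language model", "software"],
    relations=[("developer", "BigScience")],
)
prompt = build_prompt(
    "Extract entities of type 'language model'.",
    "BLOOM is a multilingual language model.",
)
unseen = out_of_ontology_types(profile, ontology={"software", "person"})
```

In this sketch, `unseen` holds the types that fall outside the predefined ontology, which is the case the open-world setting is meant to cover.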
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Data Quality and Management
Methods: BLOOM
