Instruct and Extract: Instruction Tuning for On-Demand Information Extraction

Yizhu Jiao; Ming Zhong; Sha Li; Ruining Zhao; Siru Ouyang; Heng Ji; Jiawei Han

arXiv:2310.16040 · cs.CL · October 25, 2023 · 1 citation

TL;DR

This paper introduces a new instruction-tuning paradigm for personalized, on-demand information extraction that allows users to specify extraction tasks and formats, supported by a new benchmark and a specialized model.

Contribution

It proposes the On-Demand Information Extraction paradigm, creates the InstructIE benchmark, and develops the ODIE model, advancing personalized extraction capabilities in NLP.

Findings

1. ODIE outperforms existing open-source models of similar size.
2. The benchmark facilitates research in personalized information extraction.
3. The approach enables flexible, user-specified extraction tasks.

Abstract

Large language models with instruction-following capabilities open the door to a wider group of users. However, when it comes to information extraction, a classic task in natural language processing, most task-specific systems cannot align well with long-tail ad hoc extraction use cases for non-expert users. To address this, we propose a novel paradigm, termed On-Demand Information Extraction, to fulfill the personalized demands of real-world users. Our task aims to follow the instructions to extract the desired content from the associated text and present it in a structured tabular format. The table headers can either be user-specified or inferred contextually by the model. To facilitate research in this emerging area, we present a benchmark named InstructIE, comprising both automatically generated training data and a human-annotated test set. Building on InstructIE, we…
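To make the task description concrete, the sketch below models one extraction request and its tabular output. It is an illustration only: the class names, the `validate` helper, the pipe-delimited rendering, and the toy drug example are all assumptions made here, not the InstructIE data format or the ODIE model's interface.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ExtractionRequest:
    """Hypothetical container for one on-demand extraction request."""
    instruction: str
    text: str
    # None means the headers are left for the model to infer from context.
    headers: Optional[List[str]] = None


@dataclass
class ExtractionTable:
    """Hypothetical container for the structured, tabular answer."""
    headers: List[str]
    rows: List[List[str]] = field(default_factory=list)

    def to_text(self) -> str:
        # Render as pipe-delimited lines; this layout is an assumption.
        lines = [" | ".join(self.headers)]
        lines += [" | ".join(row) for row in self.rows]
        return "\n".join(lines)


def validate(request: ExtractionRequest, table: ExtractionTable) -> bool:
    """Check that the table is rectangular and honors user-specified headers."""
    if request.headers is not None and table.headers != request.headers:
        return False
    return all(len(row) == len(table.headers) for row in table.rows)


# Toy example invented for illustration (not taken from the benchmark).
request = ExtractionRequest(
    instruction="List each drug and its side effect.",
    text="Aspirin may cause stomach upset. Ibuprofen may cause dizziness.",
    headers=["Drug", "Side effect"],
)
table = ExtractionTable(
    headers=["Drug", "Side effect"],
    rows=[["Aspirin", "stomach upset"], ["Ibuprofen", "dizziness"]],
)

print(table.to_text())
print(validate(request, table))
```

When `headers` is `None`, `validate` only checks that every row has as many cells as the header row, which mirrors the case where the model chooses the headers itself.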

Code & Models

Repositories

yzjiao/on-demand-ie (Official)

Taxonomy

Topics: Topic Modeling · Web Data Mining and Analysis · Data Quality and Management

Methods: High-Order Consensuses · ALIGN