RUIE: Retrieval-based Unified Information Extraction using Large Language Model

Xincheng Liao; Junwen Duan; Yixi Huang; Jianxin Wang

arXiv:2409.11673 · cs.CL · January 22, 2025

TL;DR

RUIE is a retrieval-based framework that enhances large language models for unified information extraction by combining in-context learning with a novel retrieval mechanism, improving efficiency and generalization across tasks.

Contribution

It introduces the first trainable retrieval framework for UIE, integrating a novel demonstration selection mechanism and a bi-encoder retriever trained via contrastive learning and knowledge distillation.
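A rough sketch of how a contrastive objective and a distillation objective can be combined when training a retriever is given below. It is not the paper's loss: the InfoNCE form, the KL direction, the temperatures, and the weight `alpha` are all assumptions chosen for illustration, and the function names are hypothetical. `retriever_scores` stands for the bi-encoder's similarity scores for one query against its candidate demonstrations, and `reward_scores` for the scores a reward model (the teacher) assigns to the same candidates.

```python
import math


def log_softmax(scores):
    """Numerically stable log-softmax over a list of floats."""
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_z for s in scores]


def info_nce_loss(retriever_scores, positive_index, temperature=0.1):
    """Contrastive term: negative log-probability of the positive candidate.

    Assumption: one positive per query, all other candidates are negatives.
    """
    scaled = [s / temperature for s in retriever_scores]
    return -log_softmax(scaled)[positive_index]


def distillation_loss(retriever_scores, reward_scores, temperature=1.0):
    """Distillation term: KL(teacher || student) over the candidate list.

    The teacher distribution is a softmax over reward-model scores and the
    student distribution is a softmax over retriever scores.
    """
    student = log_softmax([s / temperature for s in retriever_scores])
    teacher = log_softmax([s / temperature for s in reward_scores])
    return sum(math.exp(t) * (t - s) for t, s in zip(teacher, student))


def combined_loss(retriever_scores, reward_scores, positive_index, alpha=0.5):
    """Weighted sum of both terms; alpha is a hypothetical hyperparameter."""
    contrastive = info_nce_loss(retriever_scores, positive_index)
    distill = distillation_loss(retriever_scores, reward_scores)
    return contrastive + alpha * distill
```

The distillation term is zero when the retriever ranks candidates with exactly the teacher's distribution, and the contrastive term shrinks as the positive candidate's score pulls away from the negatives.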

Findings

01 Achieves significant F1-score improvements over instruction-tuning methods.

02 Generalizes to unseen tasks, as evaluated on eight held-out datasets.

03 Serves as a universal plugin for various large language models.

Abstract

Unified information extraction (UIE) aims to extract diverse structured information from unstructured text. While large language models (LLMs) have shown promise for UIE, they require significant computational resources and often struggle to generalize to unseen tasks. We propose RUIE (Retrieval-based Unified Information Extraction), a framework that leverages in-context learning for efficient task generalization. RUIE introduces a novel demonstration selection mechanism combining LLM preferences with a keyword-enhanced reward model, and employs a bi-encoder retriever trained through contrastive learning and knowledge distillation. As the first trainable retrieval framework for UIE, RUIE serves as a universal plugin for various LLMs. Experimental results on eight held-out datasets demonstrate RUIE's effectiveness, with average F1-score improvements of 19.22 and 3.22 compared to…
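The retrieve-then-prompt flow that the abstract describes can be sketched as follows. This is only an illustration under stated assumptions: `embed` is a toy bag-of-words stand-in for a trained bi-encoder, and the prompt template, the candidate pool, and the function names are hypothetical rather than taken from the paper or its repository.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a trained encoder: lowercase term counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, pool, k=2):
    """Return the k pool entries whose inputs are most similar to the query."""
    q = embed(query)
    ranked = sorted(pool, key=lambda d: cosine(q, embed(d["input"])), reverse=True)
    return ranked[:k]


def build_prompt(instruction, demonstrations, query):
    """Assemble an in-context prompt: instruction, demonstrations, then the query."""
    parts = [instruction]
    for d in demonstrations:
        parts.append(f"Input: {d['input']}\nOutput: {d['output']}")
    parts.append(f"Input: {query}\nOutput:")
    return "\n\n".join(parts)


# Toy candidate pool (illustrative examples, not from any benchmark).
pool = [
    {"input": "Apple hired Tim Cook in 1998 .",
     "output": "organization: Apple; person: Tim Cook"},
    {"input": "The river floods every spring .",
     "output": "none"},
    {"input": "Google hired Sundar Pichai in 2004 .",
     "output": "organization: Google; person: Sundar Pichai"},
]

query = "Microsoft hired Satya Nadella in 1992 ."
demos = retrieve(query, pool, k=2)
prompt = build_prompt("Extract named entities from the input.", demos, query)
```

The resulting `prompt` string would then be passed to whichever LLM is in use. In this design only the retriever needs training, which is one way to read the "universal plugin" claim.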

Code & Models

Repositories

ostars/ruie
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Data Quality and Management · Natural Language Processing Techniques

Methods: Contrastive Learning