ADELIE: Aligning Large Language Models on Information Extraction

Yunjia Qi; Hao Peng; Xiaozhi Wang; Bin Xu; Lei Hou; Juanzi Li

arXiv:2405.05008 · cs.CL · October 25, 2024 · 1 citation

TL;DR

ADELIE is a large language model aligned for information extraction (IE). It covers closed IE, open IE, and on-demand IE, reports state-of-the-art results on held-out IE datasets, and maintains strong general capabilities.

Contribution

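The paper introduces ADELIE, an aligned LLM tailored for information extraction, together with IEInstruct, a high-quality IE alignment corpus, and a two-stage training recipe: instruction tuning on IEInstruct (ADELIE_SFT) followed by direct preference optimization (ADELIE_DPO).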

Findings

01 Achieves state-of-the-art performance on IE datasets.

02 Maintains strong general capabilities across tasks.

03 Provides open-source code, data, and models.

Abstract

Large language models (LLMs) usually fall short on information extraction (IE) tasks and struggle to follow the complex instructions of IE tasks. This primarily arises from LLMs not being aligned with humans, as mainstream alignment datasets typically do not include IE data. In this paper, we introduce ADELIE (Aligning large language moDELs on Information Extraction), an aligned LLM that effectively solves various IE tasks, including closed IE, open IE, and on-demand IE. We first collect and construct a high-quality alignment corpus IEInstruct for IE. Then we train ADELIE_SFT using instruction tuning on IEInstruct. We further train ADELIE_SFT with direct preference optimization (DPO) objective, resulting in ADELIE_DPO. Extensive experiments on various held-out IE datasets demonstrate that our models (ADELIE_SFT and ADELIE_DPO) achieve state-of-the-art (SoTA) performance among…
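The second training stage mentioned in the abstract uses the direct preference optimization (DPO) objective. As a rough illustration of what that objective computes, here is a minimal sketch of the standard per-example DPO loss in plain Python. It is not taken from the ADELIE code; the function names, the example log-probabilities, and the value `beta=0.1` are assumptions chosen for illustration only.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # assumed value, not from the paper
) -> float:
    """Standard DPO loss for one (chosen, rejected) response pair.

    Each argument is the summed log-probability of a full response under
    either the policy being trained or the frozen reference model.
    """
    # Implicit rewards: how much the policy has moved away from the reference.
    chosen_shift = policy_chosen_logp - ref_chosen_logp
    rejected_shift = policy_rejected_logp - ref_rejected_logp
    # The loss shrinks as the policy prefers the chosen response more strongly.
    margin = beta * (chosen_shift - rejected_shift)
    return -log_sigmoid(margin)


if __name__ == "__main__":
    # Hypothetical log-probabilities, for illustration only.
    print(dpo_loss(-4.0, -6.0, -5.0, -5.0))
```

When the policy and the reference assign identical log-probabilities, the margin is zero and the loss equals log 2. It falls below that value once the policy favors the chosen response relative to the reference, and rises above it in the opposite case.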

Code & Models

Repositories

THU-KEG/ADELIE (PyTorch, official)

Videos

ADELIE: Aligning Large Language Models on Information Extraction · Underline

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling