Consistency Guided Knowledge Retrieval and Denoising in LLMs for Zero-shot Document-level Relation Triplet Extraction

Qi Sun; Kun Huang; Xiaocui Yang; Rong Tong; Kun Zhang; Soujanya Poria

arXiv:2401.13598 · cs.CL · January 25, 2024 · 1 citation

TL;DR

This paper introduces ZeroDocRTE, a zero-shot framework that uses retrieval and denoising of knowledge from LLMs to generate labeled data for document-level relation extraction, reducing reliance on manual annotations.

Contribution

It proposes a novel zero-shot data generation method using LLMs with a chain-of-retrieval prompt and a denoising strategy, enabling effective fine-tuning for relation extraction.

Findings

01 Outperforms strong baselines on two datasets
02 Effective zero-shot relation triplet extraction
03 Improved data quality through denoising

Abstract

Document-level Relation Triplet Extraction (DocRTE) is a fundamental task in information systems that aims to simultaneously extract entities with semantic relations from a document. Existing methods heavily rely on a substantial amount of fully labeled data. However, collecting and annotating data for newly emerging relations is time-consuming and labor-intensive. Recent advanced Large Language Models (LLMs), such as ChatGPT and LLaMA, exhibit impressive long-text generation capabilities, inspiring us to explore an alternative approach for obtaining auto-labeled documents with new relations. In this paper, we propose a Zero-shot Document-level Relation Triplet Extraction (ZeroDocRTE) framework, which generates labeled data by retrieving and denoising knowledge from LLMs, called GenRDK. Specifically, we propose a chain-of-retrieval prompt to guide ChatGPT to generate labeled long-text…
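
To make the task output and the idea of consistency-based denoising concrete, here is a minimal Python sketch. It is not the authors' GenRDK algorithm, whose details are cut off in the abstract above. The `Triplet` type, the `consistent_triplets` function, the `min_votes` threshold, and the example entities and relation labels are all assumptions made for illustration. The sketch treats a relation triplet as a (head, relation, tail) record and keeps only triplets that several independent generation runs agree on.

```python
from collections import Counter
from typing import NamedTuple


class Triplet(NamedTuple):
    """Hypothetical record for one extracted relation triplet."""
    head: str      # head entity mention
    relation: str  # relation type label
    tail: str      # tail entity mention


def consistent_triplets(runs: list[list[Triplet]], min_votes: int = 2) -> list[Triplet]:
    """Keep triplets proposed by at least `min_votes` independent runs.

    This is a simple majority-style vote, used here only as an assumed
    stand-in for a consistency-guided denoising step.
    """
    votes: Counter = Counter()
    for run in runs:
        votes.update(set(run))  # count each triplet at most once per run
    return sorted(t for t, n in votes.items() if n >= min_votes)


# Three hypothetical labeling runs over the same generated document.
runs = [
    [Triplet("Marie Curie", "born_in", "Warsaw"),
     Triplet("Marie Curie", "spouse", "Pierre Curie")],
    [Triplet("Marie Curie", "born_in", "Warsaw")],
    [Triplet("Marie Curie", "born_in", "Warsaw"),
     Triplet("Marie Curie", "born_in", "Paris")],
]

kept = consistent_triplets(runs)
```

Here only the triplet that appears in more than one run survives. Raising `min_votes` trades recall for precision, which is the usual tension when filtering automatically generated labels.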

Code & Models

Repositories

qisun123/genrdk (Official)

Taxonomy

Topics: Topic Modeling · Advanced Text Analysis Techniques · Natural Language Processing Techniques