Semi-automatic Data Enhancement for Document-Level Relation Extraction with Distant Supervision from Large Language Models
Junpeng Li, Zixia Jia, Zilong Zheng

TL;DR
This paper introduces a semi-automatic method combining large language models and natural language inference to enhance document-level relation extraction datasets with minimal human effort, especially for long-tail relation types.
Contribution
The paper presents a novel approach integrating LLMs and NLI for dataset augmentation in document-level relation extraction, addressing challenges of fine-grained relation types and uncontrolled LLM generations.
Findings
Created the DocGNRE dataset for improved relation annotation.
Demonstrated effectiveness in re-annotating long-tail relation types.
Showed potential for domain-specific relation type applications.
Abstract
Document-level Relation Extraction (DocRE), which aims to extract relations from a long context, is a critical challenge in achieving fine-grained structural comprehension and generating interpretable document representations. Inspired by recent advances in the in-context learning capabilities that emerge from large language models (LLMs), such as ChatGPT, we aim to design an automated annotation method for DocRE with minimal human effort. Unfortunately, vanilla in-context learning is infeasible for document-level relation extraction because of the large number of predefined fine-grained relation types and the uncontrolled generations of LLMs. To tackle this issue, we propose a method that integrates an LLM with a natural language inference (NLI) module to generate relation triples, thereby augmenting document-level relation datasets. We demonstrate the effectiveness of our approach by…
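The abstract's core idea, letting an LLM propose relations freely and then using NLI to map them onto a fixed relation inventory, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the relation templates, the `align_triple` helper, the threshold, and the token-overlap scorer standing in for a real NLI model are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class Triple:
    head: str
    relation: str
    tail: str


# Hypothetical templates: each predefined relation type is verbalized
# as a hypothesis sentence that an NLI model could score.
RELATION_TEMPLATES: Dict[str, str] = {
    "country of citizenship": "{head} is a citizen of {tail}.",
    "place of birth": "{head} was born in {tail}.",
    "employer": "{head} works for {tail}.",
}


def align_triple(
    premise: str,
    head: str,
    tail: str,
    nli_score: Callable[[str, str], float],
    threshold: float = 0.8,
) -> Optional[Triple]:
    """Map a free-form LLM statement (premise) about an entity pair to
    the best-entailed predefined relation, or None if no relation's
    score exceeds the threshold."""
    best_rel, best_score = None, threshold
    for rel, template in RELATION_TEMPLATES.items():
        hypothesis = template.format(head=head, tail=tail)
        score = nli_score(premise, hypothesis)
        if score > best_score:
            best_rel, best_score = rel, score
    return Triple(head, best_rel, tail) if best_rel else None


def toy_nli_score(premise: str, hypothesis: str) -> float:
    """Stand-in for an NLI entailment probability: the fraction of
    hypothesis tokens that also appear in the premise. A real system
    would call a trained NLI model here."""
    p = set(premise.lower().rstrip(".").split())
    h = set(hypothesis.lower().rstrip(".").split())
    return len(p & h) / len(h)


# Statements of the kind an LLM might generate for an entity pair.
kept = align_triple("Marie Curie was born in Warsaw.",
                    "Marie Curie", "Warsaw", toy_nli_score)
dropped = align_triple("Marie Curie visited Warsaw.",
                       "Marie Curie", "Warsaw", toy_nli_score)
print(kept)
print(dropped)
```

In this sketch the threshold acts as the filter on uncontrolled generations: a statement that entails none of the predefined relation types yields no triple rather than a forced label.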
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Biomedical Text Mining and Ontologies
