AmalREC: A Dataset for Relation Extraction and Classification Leveraging Amalgamation of Large Language Models
Mansi, Pranshu Pandya, Mahek Bhavesh Vora, Soumya Bharadwaj, Ashish Anand

TL;DR
This paper introduces AmalREC, a large-scale, diverse relation extraction dataset generated using a multi-stage LLM-based pipeline, and proposes a novel evaluation and amalgamation method to enhance sentence quality for relation classification tasks.
Contribution
It presents a new framework for generating high-quality relation sentences using LLMs, introduces the Sentence Evaluation Index and SEI-Ranker, and provides a comprehensive benchmark dataset with diverse relation types.
Findings
The dataset contains 255 relation types with 15K test and 150K train sentences.
The proposed methods improve sentence quality and diversity for relation extraction.
Evaluation shows competitive performance against state-of-the-art baselines.
Abstract
Existing datasets for relation classification and extraction often exhibit limitations such as restricted relation types and domain-specific biases. This work presents a generic framework to generate well-structured sentences from given tuples with the help of Large Language Models (LLMs). This study has focused on the following major questions: (i) how to generate sentences from relation tuples, (ii) how to compare and rank them, (iii) can we combine the strengths of individual methods and amalgamate them to generate sentences of even better quality, and (iv) how to evaluate the final dataset? For the first question, we employ a multifaceted 5-stage pipeline approach, leveraging LLMs in conjunction with template-guided generation. We introduce the Sentence Evaluation Index (SEI), which prioritizes factors like grammatical correctness, fluency, human-aligned sentiment, accuracy, and complexity…
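To make the idea of a multi-factor sentence index concrete, here is a minimal Python sketch of ranking candidate sentences by a weighted composite score. It is an illustration only, not the authors' SEI or SEI-Ranker: the abstract names the factors but not how they are combined, so the weights, the [0, 1] score range, and the names `WEIGHTS`, `Candidate`, `composite_score`, and `rank_candidates` are all assumptions.

```python
from dataclasses import dataclass

# Hypothetical weights: the abstract lists these factors but does not
# say how they are weighted or combined.
WEIGHTS = {
    "grammar": 0.25,
    "fluency": 0.25,
    "sentiment": 0.15,
    "accuracy": 0.25,
    "complexity": 0.10,
}


@dataclass
class Candidate:
    text: str
    scores: dict  # factor name -> score in [0, 1], assumed precomputed elsewhere


def composite_score(scores, weights=WEIGHTS):
    """Weighted average of per-factor scores; a missing factor counts as 0."""
    total = sum(weights.values())
    return sum(w * scores.get(name, 0.0) for name, w in weights.items()) / total


def rank_candidates(candidates, weights=WEIGHTS):
    """Return candidates ordered from highest to lowest composite score."""
    return sorted(
        candidates,
        key=lambda c: composite_score(c.scores, weights),
        reverse=True,
    )


# Usage: two candidate sentences for the same (hypothetical) relation tuple.
candidates = [
    Candidate("France capital Paris is.", {name: 0.5 for name in WEIGHTS}),
    Candidate("Paris is the capital of France.", {name: 0.9 for name in WEIGHTS}),
]
best = rank_candidates(candidates)[0]
print(best.text)
```

In this sketch, an amalgamation step would amount to keeping the top-ranked sentence per tuple across several generation methods. That is an interpretation of the abstract, not a confirmed detail of the pipeline.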
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling
Methods: Sparse Evolutionary Training
