Bridging the Long-Tail Gap: Robust Retrieval-Augmented Relation Completion via Multi-Stage Paraphrase Infusion

Fahmida Alam; Mihai Surdeanu; Ellen Riloff

arXiv:2604.22261·cs.CL·April 27, 2026

TL;DR

This paper introduces RC-RAG, a multi-stage, paraphrase-guided framework for relation completion with large language models. It incorporates relation paraphrases into retrieval, summarization, and generation, and improves performance, especially on rare relations, without any model fine-tuning.

Contribution

The paper presents a novel multi-stage paraphrase infusion method for relation completion that improves LLM performance on long-tail relations without requiring model fine-tuning.

Findings

01 RC-RAG consistently outperforms several RAG baselines across five LLMs and two benchmark datasets.

02 In long-tail settings, RC-RAG improves Exact Match by 40.6 points.

03 The method maintains low computational overhead.
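
For readers unfamiliar with the metric, the following is a minimal sketch of how an Exact Match score is commonly computed, expressed on a 0 to 100 scale so that differences read as "points". The normalization shown (lowercasing, stripping punctuation and English articles) is a widely used convention and an assumption here; this page does not state the normalization the paper applies. The function names are illustrative.

```python
import re
import string


def normalize(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace.

    Assumption: a common QA-style normalization, not confirmed by the paper.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, gold_answers):
    """Return 1 if the prediction equals any gold answer after normalization."""
    return int(any(normalize(prediction) == normalize(g) for g in gold_answers))


def em_score(predictions, references):
    """Percentage of predictions that exactly match a gold answer."""
    hits = sum(exact_match(p, golds) for p, golds in zip(predictions, references))
    return 100.0 * hits / len(predictions)
```

Under this convention, a gain of 40.6 points means the share of exactly matching answers rises by 40.6 percentage points.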

Abstract

Large language models (LLMs) struggle with relation completion (RC), both with and without retrieval-augmented generation (RAG), particularly when the required information is rare or sparsely represented. To address this, we propose a novel multi-stage paraphrase-guided relation-completion framework, RC-RAG, that systematically incorporates relation paraphrases across multiple stages. In particular, RC-RAG: (a) integrates paraphrases into retrieval to expand lexical coverage of the relation, (b) uses paraphrases to generate relation-aware summaries, and (c) leverages paraphrases during generation to guide reasoning for relation completion. Importantly, our method does not require any model fine-tuning. Experiments with five LLMs on two benchmark datasets show that RC-RAG consistently outperforms several RAG baselines. In long-tail settings, the best-performing LLM augmented with RC-RAG…
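
Stages (a) and (c) described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the token-overlap scorer stands in for whatever retriever the paper uses, the prompt wording is invented, and every function name (`expand_queries`, `retrieve`, `build_prompt`) is hypothetical. Stage (b), relation-aware summarization, needs an LLM call and is omitted.

```python
import re


def tokenize(text):
    """Lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def expand_queries(subject, relation, paraphrases):
    """Stage (a): one query per surface form of the relation (hypothetical helper)."""
    return [f"{subject} {form}" for form in [relation, *paraphrases]]


def overlap_score(query, passage):
    """Fraction of query tokens found in the passage (a toy stand-in for a real retriever)."""
    q = set(tokenize(query))
    p = set(tokenize(passage))
    return len(q & p) / len(q) if q else 0.0


def retrieve(subject, relation, paraphrases, passages, k=2):
    """Rank passages by their best score over all paraphrase-expanded queries."""
    queries = expand_queries(subject, relation, paraphrases)
    scored = [(max(overlap_score(q, p) for q in queries), p) for p in passages]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p for score, p in scored[:k] if score > 0]


def build_prompt(subject, relation, paraphrases, evidence):
    """Stage (c): expose the paraphrases to the generator (prompt wording is an assumption)."""
    lines = [
        f"Relation: {relation}",
        "Equivalent phrasings: " + "; ".join(paraphrases),
        "Evidence:",
        *[f"- {passage}" for passage in evidence],
        f"Complete the relation for the subject: {subject}",
    ]
    return "\n".join(lines)


passages = [
    "Marie Curie was born in Warsaw.",
    "Marie Curie won the Nobel Prize.",
    "Warsaw is the capital of Poland.",
]
paraphrases = ["was born in", "birthplace"]
evidence = retrieve("Marie Curie", "place of birth", paraphrases, passages, k=1)
prompt = build_prompt("Marie Curie", "place of birth", paraphrases, evidence)
```

In this toy example the passage stating the birthplace shares no content word with the canonical relation name "place of birth"; it ranks first only because the paraphrase "was born in" matches its wording, which is the lexical-coverage effect the abstract attributes to stage (a).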
