TL;DR
This paper introduces ARK, a novel retriever fine-tuning framework that enhances answer alignment in retrieval-augmented generation by leveraging curriculum learning with knowledge graph augmentation.
Contribution
It proposes a new fine-tuning method using curriculum contrastive learning with KG-augmented queries to improve answer-centric retrieval performance.
Findings
ARK achieves state-of-the-art results on 10 datasets from UltraDomain and LongBench.
It improves retrieval accuracy by 14.5% over the base model.
It maintains efficiency on long-context retrieval tasks.
Abstract
Retrieval-Augmented Generation (RAG) has emerged as a powerful framework for knowledge-intensive tasks, yet its effectiveness in long-context scenarios is often bottlenecked by the retriever's inability to distinguish sparse yet crucial evidence. Standard retrievers, optimized for query-document similarity, frequently fail to align with the downstream goal of generating a precise answer. To bridge this gap, we propose a novel fine-tuning framework that optimizes the retriever for Answer Alignment. Specifically, we first identify high-quality positive chunks by evaluating their sufficiency to generate the correct answer. We then employ a curriculum-based contrastive learning scheme to fine-tune the retriever. This curriculum leverages LLM-constructed Knowledge Graphs (KGs) to generate augmented queries, which in turn mine progressively challenging hard negatives. This process trains the…
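The curriculum-based contrastive scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The InfoNCE form of the loss, the temperature value, and the helper names `info_nce` and `curriculum_negatives` are assumptions made for illustration. In particular, the abstract mines hard negatives through KG-augmented queries; this sketch swaps in a simpler proxy and treats a negative as harder the more similar its embedding is to the query.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def info_nce(query, positive, negatives, temperature=0.05):
    """Contrastive (InfoNCE-style) loss for one query.

    Returns -log of the softmax probability assigned to the positive chunk
    among the positive and all negatives. The loss form is an assumption.
    """
    logits = [dot(query, positive) / temperature]
    logits += [dot(query, neg) / temperature for neg in negatives]
    peak = max(logits)  # subtract the max for numerical stability
    log_z = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_z - logits[0]


def curriculum_negatives(query, candidates, stage, num_stages, k):
    """Pick k negatives whose difficulty grows with the curriculum stage.

    Hypothetical proxy: candidates are ranked by similarity to the query,
    and a window of size k slides from the easy end to the hard end.
    """
    ranked = sorted(candidates, key=lambda c: dot(query, c))  # easy to hard
    max_start = max(len(ranked) - k, 0)
    start = round(max_start * stage / max(num_stages - 1, 1))
    return ranked[start:start + k]


# Toy 2-D embeddings, invented purely for illustration.
query = [1.0, 0.0]
positive = [1.0, 0.0]
candidates = [[-1.0, 0.0], [0.0, 1.0], [0.6, 0.8], [0.9, 0.436]]

losses = []
for stage in range(3):
    negs = curriculum_negatives(query, candidates, stage, num_stages=3, k=2)
    losses.append(info_nce(query, positive, negs))
```

In this toy setup the loss rises from stage to stage because later stages present negatives that sit closer to the query, which is the pressure a curriculum is meant to apply gradually. A real retriever would backpropagate this loss through the encoder rather than score fixed vectors.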
