Toward General Instruction-Following Alignment for Retrieval-Augmented Generation
Guanting Dong, Xiaoshuai Song, Yutao Zhu, Runqi Qiao, Zhicheng Dou, Ji-Rong Wen

TL;DR
This paper introduces VIF-RAG, a scalable pipeline for instruction-following alignment in RAG systems, and FollowRAG, a benchmark for evaluating how well LLMs adhere to instructions in retrieval-augmented tasks.
Contribution
It presents the first automated synthetic pipeline for instruction alignment in RAG and introduces a comprehensive benchmark for evaluation.
Findings
VIF-RAG improves LLM performance on instruction constraints
FollowRAG Benchmark covers 22 instruction categories
Automated pipeline scales to over 100k data samples
Abstract
Following natural instructions is crucial for the effective application of Retrieval-Augmented Generation (RAG) systems. Despite recent advancements in Large Language Models (LLMs), research on assessing and improving instruction-following (IF) alignment within the RAG domain remains limited. To address this issue, we propose VIF-RAG, the first automated, scalable, and verifiable synthetic pipeline for instruction-following alignment in RAG systems. We start by manually crafting a minimal set of atomic instructions (<100) and developing combination rules to synthesize and verify complex instructions for a seed set. We then use supervised models for instruction rewriting while simultaneously generating code to automate the verification of instruction quality via a Python executor. Finally, we integrate these instructions with extensive RAG and general data samples, scaling up to a…
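The idea of pairing each instruction with executable verification code can be illustrated with a minimal sketch. This is not the authors' implementation: the atomic instructions, the `compose` helper, and the simple "all checks must pass" combination rule below are hypothetical, chosen only to show how a complex instruction built from atomic ones can be checked automatically by running Python.

```python
from typing import Callable, Dict, List, Tuple

Checker = Callable[[str], bool]

# Hypothetical atomic instructions: each pairs instruction text with a checker.
# The paper's actual atomic set (<100 items) is not reproduced here.
ATOMIC: Dict[str, Tuple[str, Checker]] = {
    "max_words_20": ("Answer in at most 20 words.",
                     lambda r: len(r.split()) <= 20),
    "all_lowercase": ("Use only lowercase letters.",
                      lambda r: r == r.lower()),
    "mention_retrieval": ("Include the word 'retrieval'.",
                          lambda r: "retrieval" in r.lower()),
}


def compose(names: List[str]) -> Tuple[str, Checker]:
    """Combine atomic instructions into one complex instruction.

    Assumed combination rule: the texts are concatenated and a response
    passes only if every atomic checker passes.
    """
    texts = [ATOMIC[n][0] for n in names]
    checks = [ATOMIC[n][1] for n in names]

    def verify(response: str) -> bool:
        return all(check(response) for check in checks)

    return " ".join(texts), verify


instruction, verify = compose(["max_words_20", "all_lowercase"])
good = "retrieval grounds the answer in the supplied documents"
bad = "Retrieval Grounds The Answer"
print(instruction)
print(verify(good), verify(bad))
```

In a pipeline like the one the abstract describes, a response that fails such a check could be filtered out before it is mixed with RAG and general data samples; the filtering policy shown here is an assumption.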
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications
Methods: Attention Is All You Need · Linear Layer · Multi-Head Attention · Dense Connections · WordPiece · Residual Connection · Linear Warmup With Linear Decay · Dropout · Layer Normalization
