SPARC-RAG: Adaptive Sequential-Parallel Scaling with Context Management for Retrieval-Augmented Generation
Yuxin Yang, Gangda Deng, Ömer Faruk Akgül, Nima Chitsazan, Yash Govilkar, Akasha Tigalappanavara, Shi-Xiong Zhang, Sambit Sahu, Viktor Prasanna

TL;DR
SPARC-RAG introduces a multi-agent framework that efficiently combines sequential and parallel scaling with explicit context management, significantly improving multi-hop question answering performance while reducing inference costs.
Contribution
The paper presents a multi-agent approach with explicit context control, together with a fine-tuning method for optimizing inference-time scaling in retrieval-augmented generation.
Findings
Achieves +6.2 F1 improvement on QA benchmarks.
Reduces inference cost compared to previous methods.
Enhances multi-hop reasoning with targeted sub-queries.
Abstract
Retrieval-Augmented Generation (RAG) grounds large language model outputs in external evidence, but remains challenged on multi-hop question answering that requires long reasoning. Recent works scale RAG at inference time along two complementary dimensions: sequential depth for iterative refinement and parallel width for coverage expansion. However, naive scaling causes context contamination and scaling inefficiency, leading to diminishing or negative returns despite increased computation. To address these limitations, we propose SPARC-RAG, a multi-agent framework that coordinates sequential and parallel inference-time scaling under a unified context management mechanism. SPARC-RAG employs specialized agents that maintain a shared global context and provide explicit control over the scaling process. It generates targeted, complementary sub-queries for each branch to enable diverse…
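The interplay of sequential depth, parallel width, and a shared global context described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `GlobalContext`, `scale`, `toy_plan`, and `toy_retrieve`, the exact-match deduplication, and the stop-when-nothing-new rule are all assumptions made for illustration, and the toy corpus stands in for a real retriever and LLM-based agents.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class GlobalContext:
    """Shared evidence store (hypothetical); keeps only unique passages."""
    evidence: List[str] = field(default_factory=list)

    def merge(self, passages: List[str]) -> int:
        """Add unseen passages and return how many were new."""
        added = 0
        for p in passages:
            if p not in self.evidence:
                self.evidence.append(p)
                added += 1
        return added


def scale(
    question: str,
    plan: Callable[[str, List[str], int], List[str]],
    retrieve: Callable[[str], List[str]],
    max_depth: int = 3,
    width: int = 2,
) -> Tuple[GlobalContext, int]:
    """Run up to max_depth sequential rounds of up to `width` sub-queries each."""
    ctx = GlobalContext()
    productive_rounds = 0
    for _ in range(max_depth):                       # sequential depth
        sub_queries = plan(question, ctx.evidence, width)[:width]
        new = 0
        for q in sub_queries:                        # parallel width (run serially here)
            new += ctx.merge(retrieve(q))
        if new == 0:                                 # assumed stopping rule: nothing new found
            break
        productive_rounds += 1
    return ctx, productive_rounds


# Toy stand-ins for the planner and retriever (illustrative data only).
CORPUS = {
    "director": "Film X was directed by Person A.",
    "birthplace": "Person A was born in City B.",
}


def toy_plan(question: str, evidence: List[str], width: int) -> List[str]:
    # Ask only about facts whose passage is not yet in the shared context.
    return [k for k in CORPUS if CORPUS[k] not in evidence][:width]


def toy_retrieve(query: str) -> List[str]:
    return [CORPUS[query]] if query in CORPUS else []


ctx, rounds = scale(
    "Where was the director of Film X born?",
    toy_plan, toy_retrieve, max_depth=3, width=1,
)
```

With `width=1` the toy run needs two sequential rounds to collect both passages; with `width=2` both sub-queries are issued in the first round. The deduplicating `merge` is one simple way to keep repeated evidence out of the shared context, chosen here only for brevity.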
Taxonomy
Topics: Topic Modeling · Multimodal Machine Learning Applications · Information Retrieval and Search Behavior
