RAG-R1: Incentivizing the Search and Reasoning Capabilities of LLMs through Multi-query Parallelism

Zhiwen Tan; Jiaming Huang; Qintong Wu; Hongxuan Zhang; Chenyi Zhuang; Jinjie Gu

arXiv:2507.02962 · cs.CL · January 14, 2026

TL;DR

RAG-R1 is a two-stage training framework built around multi-query parallelism. It lets LLMs adaptively draw on internal and external knowledge during reasoning, which improves reasoning robustness and reduces inference latency.

Contribution

The paper presents a novel two-stage training framework that moves LLMs from a single-query mode to multi-query parallelism, improving both reasoning robustness and inference efficiency.

Findings

01. Outperforms the baseline by up to 13.7% on QA benchmarks
02. Reduces inference time by 11.1%
03. Enhances reasoning robustness with multi-query parallelism

Abstract

Large Language Models (LLMs), despite their remarkable capabilities, are prone to generating hallucinated or outdated content due to their static internal knowledge. While Retrieval-Augmented Generation (RAG) integrated with Reinforcement Learning (RL) offers a solution, these methods are fundamentally constrained by a single-query mode, leading to prohibitive latency and inherent brittleness. To overcome these limitations, we introduce RAG-R1, a novel two-stage training framework centered around multi-query parallelism. Our framework enables LLMs to adaptively leverage internal and external knowledge during the reasoning process while transitioning from the single-query mode to multi-query parallelism. This architectural shift bolsters reasoning robustness while significantly reducing inference latency. Extensive experiments on seven question-answering benchmarks confirm the…
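The latency argument in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the `search` function, the toy `CORPUS`, and the thread-pool dispatch are all hypothetical stand-ins chosen for illustration. The sketch only contrasts issuing retrieval queries one at a time with issuing several at once and collecting the results in order.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy corpus; stands in for an external search engine.
CORPUS = {
    "capital of france": "Paris is the capital of France.",
    "population of paris": "Paris has roughly two million residents.",
}


def search(query: str) -> str:
    """Hypothetical retriever: case-insensitive exact-match lookup."""
    return CORPUS.get(query.lower(), "no result")


def search_sequential(queries):
    """Single-query mode: one retrieval call after another."""
    return [search(q) for q in queries]


def search_parallel(queries, max_workers=4):
    """Multi-query mode: submit all queries at once.

    pool.map returns results in the same order as the input queries.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(search, queries))


queries = ["capital of France", "population of Paris", "unknown topic"]
results = search_parallel(queries)
```

With a real retriever whose calls are dominated by network or I/O wait, the parallel variant would be expected to finish in roughly the time of the slowest single query rather than the sum of all of them. A query that returns nothing useful also does not block the others, which is one plausible reading of the robustness claim.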

Code & Models

Repositories

inclusionai/aworld

Taxonomy

Topics: Library Science and Information Systems · Semantic Web and Ontologies · Natural Language Processing Techniques