UR$^2$: Unify RAG and Reasoning through Reinforcement Learning

Weitao Li; Boran Xiang; Xiaolong Wang; Zhinan Gou; Weizhi Ma; Yang Liu

arXiv:2508.06165·cs.CL·April 27, 2026

TL;DR

UR$^2$ is a reinforcement learning framework that unifies retrieval-augmented generation and reasoning, enhancing large language models' performance across diverse tasks by dynamically coordinating retrieval and reasoning strategies.

Contribution

The paper introduces UR$^2$, a novel framework that combines retrieval and reasoning with a difficulty-aware curriculum and hybrid knowledge access, improving robustness and generalization.

Findings

01. UR$^2$ outperforms existing RAG and RL baselines on multiple tasks.

02. UR$^2$ achieves performance comparable to GPT-4 on several benchmarks.

03. The code for UR$^2$ is publicly available on GitHub (Tsinghua-dhy/UR2).

Abstract

Large Language Models (LLMs) have shown strong capabilities through two complementary paradigms: Retrieval-Augmented Generation (RAG) for knowledge grounding and Reinforcement Learning from Verifiable Rewards (RLVR) for complex reasoning. However, existing attempts to unify these paradigms remain narrow in scope, typically limited to open-domain QA with fixed retrieval settings, which constrains generalization to broader domains. To address this limitation, we propose UR$^2$ (Unified RAG and Reasoning), a general reinforcement learning framework that dynamically coordinates retrieval and reasoning. UR$^2$ introduces two key designs: a difficulty-aware curriculum that selectively invokes retrieval only for challenging instances, and a hybrid knowledge access strategy that combines domain-specific offline corpora with on-the-fly LLM-generated summaries. Together, these components…
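
The two designs named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Instance` type, the `difficulty` score, the `threshold` rule, and the `search_corpus` and `summarize` callables are all hypothetical names introduced here, and the way passages and the summary are combined is an assumption made only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Instance:
    question: str
    difficulty: float  # hypothetical score in [0, 1]; higher means harder


def gather_context(
    instance: Instance,
    search_corpus: Callable[[str, int], List[str]],
    summarize: Callable[[str, List[str]], str],
    threshold: float = 0.5,
    top_k: int = 2,
) -> List[str]:
    """Return retrieval context for hard instances, nothing for easy ones."""
    # Difficulty gate (assumed rule): easy instances rely on reasoning alone.
    if instance.difficulty < threshold:
        return []
    # Hybrid access (assumed combination): offline passages plus one LLM summary.
    passages = search_corpus(instance.question, top_k)
    summary = summarize(instance.question, passages)
    return passages + [summary]
```

With stub callables, an instance below the threshold receives an empty context, while a harder one receives the top-k offline passages followed by a single generated summary. How the paper actually scores difficulty and merges the two knowledge sources is not stated in the excerpt above.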

Code & Models

Repositories

Tsinghua-dhy/UR2
github