ReARTeR: Retrieval-Augmented Reasoning with Trustworthy Process Rewarding

Zhongxiang Sun; Qipeng Wang; Weijie Yu; Xiaoxue Zang; Kai Zheng; Jun Xu; Xiao Zhang; Song Yang; Han Li

arXiv:2501.07861·cs.CL·January 15, 2025

TL;DR

ReARTeR enhances retrieval-augmented reasoning in large language models by integrating trustworthy process reward mechanisms, natural language explanations, and search strategies to improve multi-step reasoning accuracy and address bias issues.

Contribution

This paper introduces ReARTeR, a novel framework combining post-training and test-time methods to improve reasoning in RAG systems, including new reward models and bias mitigation techniques.

Findings

01. Significant performance improvements on multi-step reasoning benchmarks.

02. Effective bias mitigation in process reward training.

03. Enhanced step-level reasoning through Monte Carlo Tree Search.

Abstract

Retrieval-Augmented Generation (RAG) systems for Large Language Models (LLMs) hold promise in knowledge-intensive tasks but face limitations in complex multi-step reasoning. While recent methods have integrated RAG with chain-of-thought reasoning or test-time search using Process Reward Models (PRMs), these approaches encounter challenges such as a lack of explanations, bias in PRM training data, early-step bias in PRM scores, and insufficient post-training optimization of reasoning potential. To address these issues, we propose Retrieval-Augmented Reasoning through Trustworthy Process Rewarding (ReARTeR), a framework that enhances RAG systems' reasoning capabilities through post-training and test-time scaling. At test time, ReARTeR introduces Trustworthy Process Rewarding via a Process Reward Model for accurate scalar scoring and a Process Explanation Model (PEM) for generating natural…
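
The abstract's idea of test-time search guided by a Process Reward Model can be illustrated with a minimal sketch. This is not the paper's algorithm: the findings above mention Monte Carlo Tree Search, whereas this sketch uses a simpler greedy step selection. The names `prm_guided_search`, `propose_steps`, `score_step`, and the toy scores are all hypothetical stand-ins for a generator, a retriever, and a trained reward model.

```python
from typing import Callable, List


def prm_guided_search(
    question: str,
    propose_steps: Callable[[str, List[str]], List[str]],
    score_step: Callable[[str, List[str], str], float],
    max_steps: int = 4,
) -> List[str]:
    """Greedy step-level search: keep the highest-scoring candidate at each step."""
    chain: List[str] = []
    for _ in range(max_steps):
        candidates = propose_steps(question, chain)
        if not candidates:
            break
        # The scalar reward plays the role of a PRM score (assumption).
        best = max(candidates, key=lambda s: score_step(question, chain, s))
        chain.append(best)
        if best.startswith("ANSWER:"):
            break
    return chain


# Toy stand-ins (hypothetical): a real system would call an LLM and a retriever.
def toy_propose(question: str, chain: List[str]) -> List[str]:
    if len(chain) == 0:
        return ["search: capital of France", "guess randomly"]
    if len(chain) == 1:
        return ["ANSWER: Paris", "ANSWER: Lyon"]
    return []


def toy_score(question: str, chain: List[str], step: str) -> float:
    # Hand-written scores, used only to make the example runnable.
    scores = {
        "search: capital of France": 0.9,
        "guess randomly": 0.1,
        "ANSWER: Paris": 0.95,
        "ANSWER: Lyon": 0.2,
    }
    return scores.get(step, 0.0)


result = prm_guided_search("What is the capital of France?", toy_propose, toy_score)
print(result)
```

In this sketch the reward function only ranks candidates for the next step; an explanation model such as the PEM described in the abstract would sit alongside it and is not modeled here.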

Code & Models

Repositories

RUCAIBox/R1-Searcher
pytorch

Taxonomy

Topics: Data Quality and Management · Semantic Web and Ontologies · Business Process Modeling and Analysis

Methods: Attention Is All You Need · Layer Normalization · Dense Connections · Linear Warmup With Linear Decay · WordPiece · Attention Dropout · Adam · Residual Connection · Dropout