ReThinker: Scientific Reasoning by Rethinking with Guided Reflection and Confidence Control

Zhentao Tang; Yuqi Cui; Shixiong Kai; Wenqian Zhao; Ke Ye; Xing Li; Anxin Tian; Zehua Pei; Hui-Ling Zhen; Shoubo Hu; Xiaoguang Li; Yunhe Wang; Mingxuan Yuan

arXiv:2602.04496 · cs.AI · February 5, 2026

TL;DR

ReThinker is a confidence-aware, adaptive reasoning framework for large language models that improves expert-level scientific reasoning by dynamically orchestrating tools, reflection, and multi-agent coordination.

Contribution

ReThinker introduces a Solver-Critic-Selector architecture with confidence-based computation allocation, together with a scalable training pipeline built on reverse data synthesis and trajectory recycling.

Findings

01. Outperforms state-of-the-art models on HLE, GAIA, and XBench benchmarks.

02. Achieves state-of-the-art results on expert-level reasoning tasks.

03. Demonstrates robustness and efficiency in scientific reasoning scenarios.

Abstract

Expert-level scientific reasoning remains challenging for large language models, particularly on benchmarks such as Humanity's Last Exam (HLE), where rigid tool pipelines, brittle multi-agent coordination, and inefficient test-time scaling often limit performance. We introduce ReThinker, a confidence-aware agentic framework that orchestrates retrieval, tool use, and multi-agent reasoning through a stage-wise Solver-Critic-Selector architecture. Rather than following a fixed pipeline, ReThinker dynamically allocates computation based on model confidence, enabling adaptive tool invocation, guided multi-dimensional reflection, and robust confidence-weighted selection. To support scalable training without human annotation, we further propose a reverse data synthesis pipeline and an adaptive trajectory recycling strategy that transform successful reasoning traces into high-quality…
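
The abstract describes allocating computation by model confidence and choosing a final answer by confidence-weighted selection, but this page gives no algorithmic detail. The following is a minimal illustrative sketch of that general idea, not the authors' implementation. Every name in it (`Candidate`, `select_answer`, `solve_adaptively`, the `threshold` and `max_attempts` parameters) is hypothetical, and the stopping rule and the voting rule are assumptions chosen for simplicity.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Candidate:
    """One attempted solution (hypothetical structure, not from the paper)."""
    answer: str
    confidence: float  # assumed to be a self-reported score in [0, 1]


def select_answer(candidates):
    """Confidence-weighted vote: sum confidence per distinct answer.

    Returns the winning answer and its share of the total confidence mass.
    """
    if not candidates:
        raise ValueError("need at least one candidate")
    totals = defaultdict(float)
    for cand in candidates:
        totals[cand.answer] += cand.confidence
    best = max(totals, key=totals.get)
    mass = sum(totals.values())
    share = totals[best] / mass if mass > 0 else 0.0
    return best, share


def solve_adaptively(solver, question, threshold=0.8, max_attempts=4):
    """Spend more attempts only while confidence stays low.

    `solver(question, attempt)` is an assumed callable returning a Candidate.
    Sampling stops as soon as one attempt reaches `threshold`, or when the
    attempt budget is exhausted; all attempts then take part in the vote.
    """
    candidates = []
    for attempt in range(max_attempts):
        cand = solver(question, attempt)
        candidates.append(cand)
        if cand.confidence >= threshold:
            break
    answer, share = select_answer(candidates)
    return answer, share, len(candidates)
```

In this sketch an easy question that yields a confident first attempt costs a single solver call, while a harder one uses up to `max_attempts` calls before the vote. A real system would presumably replace the stub `solver` with tool-augmented model calls and a critic stage, which are outside what this page specifies.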

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Artificial Intelligence in Healthcare and Education