VeriThinker: Learning to Verify Makes Reasoning Model Efficient

Zigeng Chen; Xinyin Ma; Gongfan Fang; Ruonan Yu; Xinchao Wang

arXiv:2505.17941·cs.LG·May 26, 2025

TL;DR

VeriThinker fine-tunes large reasoning models on an auxiliary verification task, which significantly shortens reasoning chains and lowers inference cost while maintaining or slightly improving accuracy. The method also generalizes zero-shot to speculative reasoning.

Contribution

The paper proposes a novel CoT compression approach by fine-tuning LRMs through an auxiliary verification task, avoiding synthetic data generation.

Findings

01 Reduces reasoning tokens by over 40% on MATH500 and AIME25 datasets.

02 Achieves slight accuracy improvements while decreasing inference costs.

03 Demonstrates zero-shot generalization to speculative reasoning tasks.

Abstract

Large Reasoning Models (LRMs) excel at complex tasks using Chain-of-Thought (CoT) reasoning. However, their tendency to overthink leads to unnecessarily lengthy reasoning chains, dramatically increasing inference costs. To mitigate this issue, we introduce VeriThinker, a novel approach for CoT compression. Unlike conventional methods that fine-tune LRMs directly on the original reasoning task using synthetic concise CoT data, we fine-tune the model solely through an auxiliary verification task. By training LRMs to accurately verify the correctness of CoT solutions, the LRMs inherently become more discerning about the necessity of subsequent self-reflection steps, thereby effectively suppressing overthinking. Extensive experiments validate that VeriThinker substantially reduces reasoning chain lengths while maintaining or even slightly improving accuracy. When applied to…
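One way to picture the auxiliary verification task from the abstract is as a supervised pair: a prompt holding a question and a candidate CoT solution, and a target that labels the solution as correct or incorrect. The sketch below is illustrative only. The prompt template, the class and function names (`VerificationExample`, `build_verification_example`), and the label strings are assumptions made for this example, not the format used by the paper or by the CoT-Verification-340k dataset.

```python
from dataclasses import dataclass


@dataclass
class VerificationExample:
    """One supervised pair for a verification-style fine-tuning task (hypothetical schema)."""
    prompt: str   # question plus the candidate CoT solution to be judged
    target: str   # "correct" or "incorrect" (assumed label strings)


def build_verification_example(question: str, cot_solution: str, is_correct: bool) -> VerificationExample:
    """Wrap a question and a candidate solution into a verification prompt.

    The template below is an assumption for illustration, not the paper's template.
    """
    prompt = (
        "Question:\n" + question + "\n\n"
        "Proposed solution:\n" + cot_solution + "\n\n"
        "Is the proposed solution correct? Answer 'correct' or 'incorrect'."
    )
    target = "correct" if is_correct else "incorrect"
    return VerificationExample(prompt=prompt, target=target)


if __name__ == "__main__":
    example = build_verification_example(
        question="What is 2 + 2?",
        cot_solution="2 + 2 = 4, so the answer is 4.",
        is_correct=True,
    )
    print(example.prompt)
    print(example.target)
```

Under this framing the model is never trained to produce a shorter solution directly; it only learns to judge solutions, which matches the abstract's description of fine-tuning solely on verification.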

Code & Models

Repositories

czg1225/verithinker (official)

Models

Zigeng/R1-VeriThinker-7B
model · 9 downloads · 5 likes

Datasets

Zigeng/CoT-Verification-340k
dataset · 59 downloads
