Re-FORC: Adaptive Reward Prediction for Efficient Chain-of-Thought Reasoning

Renos Zabounidis; Aditya Golatkar; Michael Kleinman; Alessandro Achille; Wei Xia; Stefano Soatto

arXiv:2511.02130 · cs.AI · November 5, 2025

Open Access

TL;DR

Re-FORC is an adaptive reward prediction method that improves the efficiency and accuracy of reasoning models by enabling early stopping, model and thinking-length selection, and adaptive test-time scaling during inference.

Contribution

The paper presents Re-FORC, a lightweight adapter trained on reasoning models that predicts expected future reward as a function of the number of future thinking tokens, improving reasoning efficiency and accuracy.

Findings

01

Early stopping of unpromising reasoning chains reduces compute by 26% while maintaining accuracy.

02

Optimized model and thinking-length selection achieves 4% higher accuracy at equal compute, and 55% less compute at equal accuracy, compared to the largest model.

03

Adaptive test-time scaling increases accuracy by 11% in the high-compute regime and 7% in the low-compute regime.

Abstract

We propose Re-FORC, an adaptive reward prediction method that, given a context, enables prediction of the expected future rewards as a function of the number of future thinking tokens. Re-FORC trains a lightweight adapter on reasoning models, demonstrating improved prediction with longer reasoning and larger models. Re-FORC enables: 1) early stopping of unpromising reasoning chains, reducing compute by 26% while maintaining accuracy, 2) optimized model and thinking length selection that achieves 4% higher accuracy at equal compute and 55% less compute at equal accuracy compared to the largest model, 3) adaptive test-time scaling, which increases accuracy by 11% in high compute regime, and 7% in low compute regime. Re-FORC allows dynamic reasoning with length control via cost-per-token thresholds while estimating computation time upfront.
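
The cost-per-token threshold mentioned in the abstract can be illustrated with a small sketch. It assumes a predictor has already produced expected rewards for a few candidate numbers of future thinking tokens, and it picks the budget that maximizes predicted reward minus token cost. The function names, the linear cost model, and the example numbers are all hypothetical and are not taken from the paper.

```python
from typing import Sequence, Tuple


def best_thinking_budget(
    token_budgets: Sequence[int],
    predicted_rewards: Sequence[float],
    cost_per_token: float,
) -> Tuple[int, float]:
    """Return the budget with the highest (predicted reward - token cost).

    Assumption: cost grows linearly with the number of extra thinking tokens.
    Ties are resolved in favor of the earlier (smaller) budget.
    """
    if not token_budgets or len(token_budgets) != len(predicted_rewards):
        raise ValueError("budgets and rewards must be non-empty and equal length")

    best_budget = token_budgets[0]
    best_value = predicted_rewards[0] - cost_per_token * token_budgets[0]
    for budget, reward in zip(token_budgets[1:], predicted_rewards[1:]):
        value = reward - cost_per_token * budget
        if value > best_value:  # strict comparison keeps the smaller budget on ties
            best_budget, best_value = budget, value
    return best_budget, best_value


def should_stop(
    token_budgets: Sequence[int],
    predicted_rewards: Sequence[float],
    cost_per_token: float,
) -> bool:
    """Stop thinking when no positive budget beats stopping now.

    Assumption: token_budgets includes 0, meaning "answer immediately".
    """
    budget, _ = best_thinking_budget(token_budgets, predicted_rewards, cost_per_token)
    return budget == 0


# Hypothetical predictions: expected reward after 0, 256, 512, 1024 more tokens.
budgets = [0, 256, 512, 1024]
rewards = [0.40, 0.55, 0.62, 0.65]

cheap_choice = best_thinking_budget(budgets, rewards, cost_per_token=1e-4)[0]
stop_when_expensive = should_stop(budgets, rewards, cost_per_token=1e-3)
```

Under these made-up numbers, a low cost per token favors continuing to a mid-sized budget, while a tenfold higher cost makes stopping immediately the best option. Raising or lowering `cost_per_token` is one simple way to trade accuracy against compute in this sketch.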

Taxonomy

Topics: Reinforcement Learning in Robotics · Explainable Artificial Intelligence (XAI) · Multimodal Machine Learning Applications