Anytime Safe PAC Efficient Reasoning
Chengyao Yu, Hao Zeng, Youxin Zhu, Jianguo Huang, Huajun Zeng, Bingyi Jing

TL;DR
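This paper introduces Betting Probably Approximately Correct (B-PAC) reasoning, an online method that routes queries between thinking and non-thinking models. It uses accumulated statistical evidence to keep performance loss within a user-specified level while reducing computational cost, even when feedback is only partially observed.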
Contribution
Proposes the B-PAC framework, which provides anytime safety and efficiency guarantees for online reasoning by dynamically adjusting the routing threshold based on accumulated statistical evidence of safety.
Findings
Reduces thinking model usage by up to 81.01%.
Controls performance loss below user-specified levels.
Establishes theoretical guarantees for safety and efficiency.
Abstract
Large Reasoning Models (LRMs) have demonstrated remarkable performance on complex tasks but suffer from high computational costs and latency. While selective thinking strategies improve efficiency by routing easy queries to non-thinking models, existing approaches often incur uncontrollable errors, especially in online settings where the performance loss of a non-thinking model is only partially observed and data are non-stationary. To address this, we propose Betting Probably Approximately Correct (B-PAC) reasoning, a principled method that enables anytime safe and efficient online reasoning under partial feedback. Specifically, we utilize inverse propensity scoring estimators to construct test supermartingales for candidate thresholds, and then dynamically adjust the routing threshold based on the accumulated statistical evidence of safety. Theoretically, we establish the…
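The two ingredients named in the abstract, an inverse propensity scoring (IPS) estimator and a test supermartingale, can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the function names, the fixed bet size, and the constant feedback probability are assumptions made for illustration, and the paper's actual construction (for example, how bets are chosen and how candidate thresholds are updated) may differ. The sketch tests one candidate threshold against the null hypothesis that its mean performance loss is at least `eps`, and certifies the threshold as safe once the wealth reaches `1 / alpha` (Ville's inequality).

```python
def ips_loss(loss, observed, propensity):
    """Inverse-propensity-scored loss (hypothetical helper).

    `observed` says whether feedback was collected for this query, which is
    assumed to happen with probability `propensity`. The estimate is unbiased
    for `loss` because E[observed / propensity] = 1.
    """
    return loss / propensity if observed else 0.0


def betting_wealth(ips_losses, eps, bet):
    """Wealth path K_t = prod_{s<=t} (1 + bet * (eps - ips_loss_s)).

    Under the null "mean loss >= eps", each factor has conditional mean <= 1,
    so K_t is a nonnegative supermartingale as long as no factor is negative.
    A fixed `bet` is an illustrative simplification.
    """
    wealth, path = 1.0, []
    for x in ips_losses:
        factor = 1.0 + bet * (eps - x)
        if factor < 0:
            raise ValueError("bet too large: wealth would become negative")
        wealth *= factor
        path.append(wealth)
    return path


def first_certified_step(path, alpha):
    """1-based index of the first step with wealth >= 1/alpha, else None."""
    for t, wealth in enumerate(path, start=1):
        if wealth >= 1.0 / alpha:
            return t
    return None


# Usage: feedback is collected with probability 0.5, so an IPS loss is at
# most 1 / 0.5 = 2 and any bet <= 1 / (2 - eps) keeps the wealth nonnegative.
eps, alpha, bet = 0.1, 0.05, 0.5
no_errors = [ips_loss(0.0, True, 0.5) for _ in range(100)]
all_errors = [ips_loss(1.0, True, 0.5) for _ in range(100)]
safe_step = first_certified_step(betting_wealth(no_errors, eps, bet), alpha)
unsafe_step = first_certified_step(betting_wealth(all_errors, eps, bet), alpha)
```

With no observed errors the wealth grows by a constant factor at each step and eventually crosses `1 / alpha`, so the candidate threshold is certified; with an error on every observed query the wealth shrinks and the threshold is never certified.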
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Bayesian Modeling and Causal Inference · Advanced Graph Neural Networks
