Chronos: Learning Temporal Dynamics of Reasoning Chains for Test-Time Scaling
Kai Zhang, Jiayi Liao, Chengpeng Li, Ziyuan Xie, Sihang Li, Xiang Wang

TL;DR
Chronos introduces a temporal reasoning scoring method for large language models that models reasoning trajectories as time series, significantly improving test-time reasoning performance with minimal computational cost.
Contribution
The paper proposes Chronos, a chronological reasoning scorer that models each reasoning trajectory as a time series to improve test-time scaling of LLMs.
Findings
Achieves a 34.21% relative improvement over Pass@1 on HMMT25.
Demonstrates consistent gains across various models and benchmarks.
Operates with negligible additional computational overhead.
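The summary does not say how "relative improvement" is computed. A minimal sketch, assuming the usual definition (the gain divided by the baseline score), is shown here. The function name and the input values are hypothetical and are not results from the paper.

```python
def relative_improvement(baseline: float, improved: float) -> float:
    """Relative gain of `improved` over `baseline`, as a fraction.

    Assumption: the conventional definition (improved - baseline) / baseline.
    The source page does not state which definition the authors use.
    """
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (improved - baseline) / baseline


# Illustrative values only, NOT figures from the paper:
# a Pass@1 accuracy of 0.50 raised to 0.60 is a relative gain of about 20%.
gain = relative_improvement(0.50, 0.60)
print(f"{gain:.2%}")
```

Under this definition a relative improvement is larger than the absolute gain in accuracy points whenever the baseline is below 100%, so the two should not be compared directly.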
Abstract
Test-Time Scaling (TTS) has emerged as an effective paradigm for improving the reasoning performance of large language models (LLMs). However, existing methods, most notably majority voting and heuristic token-level scoring, treat all reasoning traces or tokens equally, which leaves them susceptible to substantial variations in trajectory quality and to localized logical failures. In this work, we introduce Chronos, a lightweight and plug-and-play chronological reasoning scorer that models each trajectory as a time series. Specifically, Chronos learns to capture trajectory features of token probabilities, assigns quality scores accordingly, and employs a weighted voting mechanism. Extensive evaluations on both in-domain and out-of-domain benchmarks demonstrate that Chronos consistently delivers substantial gains across a variety of models, with negligible computational overhead.…
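The weighted voting step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the learned Chronos scorer is not reproduced, and the per-trajectory quality scores are simply passed in as numbers. The function names `weighted_vote` and `majority_vote` are hypothetical.

```python
from collections import defaultdict


def weighted_vote(answers, scores):
    """Return the final answer with the largest total quality score.

    answers: final answers extracted from each sampled trajectory.
    scores:  one quality score per trajectory. Here they are placeholders;
             in Chronos they would come from the learned scorer.
    """
    if len(answers) != len(scores):
        raise ValueError("answers and scores must have the same length")
    if not answers:
        raise ValueError("need at least one trajectory")
    totals = defaultdict(float)
    for answer, score in zip(answers, scores):
        totals[answer] += score
    # Ties go to the answer that was seen first.
    return max(totals, key=totals.get)


def majority_vote(answers):
    """Plain majority voting is the special case where every score is 1."""
    return weighted_vote(answers, [1.0] * len(answers))


# Hypothetical example: two low-scoring trajectories agree on "42",
# one high-scoring trajectory answers "17".
answers = ["17", "42", "42"]
scores = [0.9, 0.3, 0.4]
print(majority_vote(answers))          # 42
print(weighted_vote(answers, scores))  # 17
```

The example shows why weighting matters: majority voting counts every trajectory equally, while a single high-quality trajectory can outweigh several low-quality ones once scores are attached.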
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
