PaCoRe: Learning to Scale Test-Time Compute with Parallel Coordinated Reasoning

Jingcheng Hu; Yinmin Zhang; Shijie Shang; Xiaobo Yang; Yue Peng; Zhewei Huang; Hebin Zhou; Xin Wu; Jie Cheng; Fanqi Wan; Xiangwen Kong; Chengyuan Yao; Kaiwen Yan; Ailin Huang; Hongyu Zhou; Qi Han; Zheng Ge; Daxin Jiang; Xiangyu Zhang; Heung-Yeung Shum

arXiv:2601.05593 · cs.LG · January 12, 2026

TL;DR

PaCoRe is a parallel reasoning framework that scales test-time compute in language models to multi-million-token reasoning and surpasses existing systems on mathematics benchmarks.

Contribution

The paper presents a novel parallel coordinated reasoning framework that scales test-time compute beyond traditional sequential methods using message-passing and reinforcement learning.

Findings

01. Achieves 94.5% on HMMT 2025, surpassing GPT-5.
02. Scales effective TTC to roughly two million tokens.
03. Demonstrates strong improvements across diverse domains.

Abstract

We introduce Parallel Coordinated Reasoning (PaCoRe), a training-and-inference framework designed to overcome a central limitation of contemporary language models: their inability to scale test-time compute (TTC) far beyond sequential reasoning under a fixed context window. PaCoRe departs from the traditional sequential paradigm by driving TTC through massive parallel exploration coordinated via a message-passing architecture in multiple rounds. Each round launches many parallel reasoning trajectories, compacts their findings into context-bounded messages, and synthesizes these messages to guide the next round and ultimately produce the final answer. Trained end-to-end with large-scale, outcome-based reinforcement learning, the model masters the synthesis abilities required by PaCoRe and scales to multi-million-token effective TTC without exceeding context limits. The approach yields…
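
The round structure described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the names `pacore_sketch`, `toy_generate`, and `toy_compact`, the `widths` parameter, and the use of whitespace-separated words as a stand-in for tokens are all assumptions made for this example. The loop also runs trajectories one after another for simplicity, whereas the framework launches them in parallel.

```python
from typing import Callable, List, Tuple

# Hypothetical interfaces; none of these names come from the paper.
Generate = Callable[[str, List[str]], str]  # (problem, messages) -> trajectory text
Compact = Callable[[str], str]              # trajectory -> context-bounded message


def pacore_sketch(
    problem: str,
    generate: Generate,
    compact: Compact,
    widths: List[int],
) -> Tuple[str, int]:
    """Run one round per entry in `widths`, then a final synthesis call.

    Returns the final answer and the total words generated, used here as a
    rough proxy for effective test-time compute.
    """
    messages: List[str] = []  # compacted findings from the previous round
    total_tokens = 0
    for width in widths:
        # Each trajectory sees only the problem plus the compacted messages,
        # so the per-call context stays bounded regardless of total compute.
        trajectories = [generate(problem, messages) for _ in range(width)]
        total_tokens += sum(len(t.split()) for t in trajectories)
        # Compaction: keep a short message per trajectory for the next round.
        messages = [compact(t) for t in trajectories]
    # Final synthesis (assumed here to be a single call over the last messages).
    final = generate(problem, messages)
    total_tokens += len(final.split())
    return final, total_tokens


def toy_generate(problem: str, messages: List[str]) -> str:
    """Toy stand-in for a model call; a real system would query an LLM."""
    return f"reasoning about {problem} with {len(messages)} messages"


def toy_compact(trajectory: str, max_words: int = 4) -> str:
    """Toy compaction: truncate a trajectory to its first `max_words` words."""
    return " ".join(trajectory.split()[:max_words])
```

Calling `pacore_sketch("2+2", toy_generate, toy_compact, widths=[3, 2])` runs two rounds of three and two trajectories before the final call. The point of the sketch is that total generated text grows with the number of rounds and trajectories, while each individual call only ever receives the problem and a bounded set of messages.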

Code & Models

Models

Datasets

stepfun-ai/PaCoRe-Train-8k
dataset · 1.1k downloads

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Natural Language Processing Techniques