Joint Consistency: A Unified Test-Time Aggregation Framework via Energy Minimization
Yunzhen Yao, Hongye Wang, Yahong Wang, Michael C. Gastpar, Bo Jiang, Lie He

TL;DR
This paper introduces Joint Consistency (JC), a unified test-time aggregation framework that uses energy minimization to combine multiple reasoning traces from a large language model into a final answer.
Contribution
The paper formulates test-time aggregation as a constrained energy minimization problem. This formulation unifies existing voting and weighted aggregation methods and builds its interaction terms from LLM-as-a-judge pairwise comparisons.
Findings
JC outperforms existing baselines on math and code reasoning benchmarks.
The framework is effective across various tasks, models, and trace settings.
An efficient approximation makes large-scale implementation feasible.
Abstract
This paper studies test-time aggregation, an approach that generates multiple reasoning traces and aggregates them into a final answer. Most existing methods rely on evaluation signals collected from candidate traces in isolation or answer frequencies, while ignoring comparative interactions among candidates. We propose Joint Consistency (JC), formulated as a constrained Ising-type energy minimization problem, where independent evaluation signals act as external fields and pairwise comparisons act as interactions. JC provides a unified framework for test-time aggregation that subsumes existing voting and weighted aggregation methods as special cases. Our construction of the interaction matrix leverages LLM-as-a-judge comparisons, and admits a theoretical interpretation under answer-level homogeneity assumptions. Moreover, we develop an efficient approximation strategy that makes…
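To make the abstract's description concrete, here is a minimal sketch of an Ising-type aggregation rule. It is not the authors' implementation. The function name `aggregate`, the arguments `fields` and `interactions`, the 0/1 selection variables, and the constraint that the selected set is exactly the traces sharing one final answer are all assumptions made for illustration. The assumed energy is E(s) = -Σ_i h_i s_i - Σ_{i<j} J_ij s_i s_j, where h_i is the independent evaluation signal of trace i and J_ij is a pairwise comparison score.

```python
from collections import defaultdict


def aggregate(answers, fields, interactions=None):
    """Return the answer whose group of traces has the lowest energy.

    answers:      final answer extracted from each trace
    fields:       per-trace evaluation signal h_i (the "external field")
    interactions: optional symmetric matrix J_ij of pairwise scores

    Assumption (hypothetical, not from the paper): a feasible selection
    picks exactly the traces that agree on a single final answer.
    """
    # Group trace indices by their final answer.
    groups = defaultdict(list)
    for i, answer in enumerate(answers):
        groups[answer].append(i)

    def energy(indices):
        # External-field term: -sum_i h_i over the selected traces.
        e = -sum(fields[i] for i in indices)
        # Interaction term: -sum_{i<j} J_ij over selected pairs.
        if interactions is not None:
            for k, i in enumerate(indices):
                for j in indices[k + 1:]:
                    e -= interactions[i][j]
        return e

    # Enumerate the feasible selections (one per distinct answer).
    return min(groups, key=lambda a: energy(groups[a]))


answers = ["42", "42", "17"]

# Unit fields and no interactions: energy is minus the vote count.
majority = aggregate(answers, [1, 1, 1])

# Non-uniform fields and no interactions: weighted voting.
weighted = aggregate(answers, [0.2, 0.2, 0.9])

# Adding pairwise support between traces 0 and 1 changes the outcome.
J = [[0, 1.0, 0], [1.0, 0, 0], [0, 0, 0]]
joint = aggregate(answers, [0.2, 0.2, 0.9], J)
```

Under these assumptions, setting every field to 1 with no interactions reduces to majority voting, and non-uniform fields with no interactions reduce to weighted voting, which mirrors the abstract's claim that existing methods arise as special cases. How the paper actually builds the interaction matrix and approximates the minimization is not shown here.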
