Compute-Accuracy Pareto Frontiers for Open-Source Reasoning Large Language Models

Ákos Prucs; Márton Csutora; Mátyás Antal; Márk Marosi

arXiv:2512.24776 · cs.CL · January 1, 2026

TL;DR

This paper evaluates open-source large language models on reasoning tasks by both accuracy and computational cost, maps the resulting Pareto frontier, and identifies the Mixture of Experts architecture as an efficient choice.

Contribution

The paper introduces a test-time-compute-aware evaluation framework for open-source LLMs, maps their Pareto frontiers, and analyzes how efficiency has changed over time.
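To make the idea of a compute-accuracy Pareto frontier concrete, here is a minimal sketch, not the authors' implementation. It assumes each evaluated configuration is summarized by one compute cost and one accuracy; the `Run` type, the unit of `compute`, and the sample values are all hypothetical and do not come from the paper. A run is on the frontier if no other run is at least as cheap and at least as accurate while being strictly better in one of the two.

```python
from typing import List, NamedTuple


class Run(NamedTuple):
    model: str       # hypothetical label for a model configuration
    compute: float   # assumed cost unit, e.g. generated tokens per problem
    accuracy: float  # fraction of problems solved, in [0, 1]


def pareto_frontier(runs: List[Run]) -> List[Run]:
    """Return the non-dominated runs, ordered by increasing compute."""
    frontier: List[Run] = []
    best_accuracy = float("-inf")
    # Visit cheapest runs first; among equal cost, most accurate first.
    for run in sorted(runs, key=lambda r: (r.compute, -r.accuracy)):
        # A run survives only if it beats every cheaper (or equally cheap) run.
        if run.accuracy > best_accuracy:
            frontier.append(run)
            best_accuracy = run.accuracy
    return frontier


# Illustrative, made-up data points (not results from the paper).
runs = [
    Run("A", 1.0, 0.40),
    Run("B", 2.0, 0.35),  # dominated by A: costs more, scores lower
    Run("C", 3.0, 0.60),
    Run("D", 5.0, 0.60),  # dominated by C: same accuracy, higher cost
    Run("E", 8.0, 0.72),
]

frontier_models = [r.model for r in pareto_frontier(runs)]
print(frontier_models)
```

Sorting once and tracking the best accuracy seen so far keeps the sketch at O(n log n), which is enough for the handful of points that a model comparison typically produces.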

Findings

1. Mixture of Experts models balance performance and efficiency.

2. Accuracy gains plateau beyond certain compute thresholds.

3. Accuracy per unit of compute improves over time.

Abstract

Large Language Models (LLMs) are demonstrating rapid improvements on complex reasoning benchmarks, particularly when allowed to utilize intermediate reasoning steps before converging on a final solution. However, current literature often overlooks the significant computational burden associated with generating long reasoning sequences. For industrial applications, model selection depends not only on raw accuracy but also on resource constraints and inference costs. In this work, we conduct a test-time-compute aware evaluation of both contemporary and older open-source LLMs, mapping their Pareto frontiers across math- and reasoning-intensive benchmarks. Our findings identify the Mixture of Experts (MoE) architecture as a strong candidate to balance performance and efficiency in our evaluation setting. Furthermore, we trace the trajectory of Pareto efficiency over time to derive an…

Taxonomy

Topics: Topic Modeling · Explainable Artificial Intelligence (XAI) · Multimodal Machine Learning Applications