Test-Time Compute Games
Ander Artola Velasco, Dimitrios Rontogiannis, Stratis Tsirtsis, Manuel Gomez-Rodriguez

TL;DR
This paper examines the inefficiency in test-time compute pricing for LLMs and proposes a reverse second-price auction mechanism to align provider incentives with social welfare.
Contribution
It introduces a novel auction-based mechanism to reduce social inefficiency in test-time compute pricing for large language models.
Findings
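The LLM-as-a-service market is socially inefficient: providers have a financial incentive to increase test-time compute even when the increase adds little to output quality.
The proposed reverse second-price auction aligns provider incentives with social welfare.
Experiments with Llama, Qwen, and DeepSeek-R1 models are used to illustrate the mechanism's effectiveness.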
Abstract
Test-time compute has emerged as a promising strategy to enhance the reasoning abilities of large language models (LLMs). However, this strategy has in turn increased how much users pay cloud-based providers offering LLM-as-a-service, since providers charge users for the amount of test-time compute they use to generate an output. In our work, we show that the market of LLM-as-a-service is socially inefficient: providers have a financial incentive to increase the amount of test-time compute, even if this increase contributes little to the quality of the outputs. To address this inefficiency, we introduce a reverse second-price auction mechanism where providers bid their offered price and (expected) quality for the opportunity to serve a user, and users pay proportionally to the marginal value generated by the winning provider relative to the second-highest bidder. To illustrate and…
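To make the auction idea in the abstract concrete, here is a minimal sketch of a textbook second-score (reverse second-price) auction. It is not the paper's exact mechanism: the linear user valuation `value_per_quality`, the `Bid` fields, the function name, and the payment rule (the winner is paid so that the user keeps the runner-up's surplus) are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bid:
    provider: str
    price: float    # price the provider asks for serving the query
    quality: float  # expected output quality (assumed to lie in [0, 1])


def run_second_score_auction(bids, value_per_quality):
    """Select a winner and a payment with a textbook second-score rule.

    Assumption: the user values quality linearly, so a bid's score is
    value_per_quality * quality - price (the surplus it offers the user).
    """
    if len(bids) < 2:
        raise ValueError("need at least two bids")

    def score(bid):
        return value_per_quality * bid.quality - bid.price

    ranked = sorted(bids, key=score, reverse=True)
    winner, runner_up = ranked[0], ranked[1]

    # The winner is paid the value it creates minus the surplus the
    # runner-up would have offered, so the payment depends on the
    # winner's marginal value over the next-best bid.
    payment = value_per_quality * winner.quality - score(runner_up)
    return winner, payment


# Hypothetical bids, invented purely for this example.
bids = [
    Bid("A", price=2.0, quality=0.90),
    Bid("B", price=1.0, quality=0.70),
    Bid("C", price=3.0, quality=0.95),
]
winner, payment = run_second_score_auction(bids, value_per_quality=10.0)
print(winner.provider, round(payment, 2))
```

In this toy run, provider A offers the highest surplus and wins; its payment is at least its asked price because its score is at least the runner-up's. Under this assumed rule, a provider cannot raise its payment by inflating its own price, only by offering more value than the next-best bid.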
