Tiered Super-Moore's Law: Price Evolution, Production Frontiers, and Market Competition in Large Language Model Inference Services
Mingdeng Du

TL;DR
This paper analyzes the economic dynamics of large language model inference services, revealing rapid price declines driven by software innovation, a market shift around May 2024, and implications for AI accessibility and policy.
Contribution
It provides the first systematic economic analysis of token pricing in LLM inference markets, introducing the 'Tiered Super-Moore' hypothesis and detailed cost and market structure insights.
Findings
Token prices declined approximately 600-fold from 2020 to 2026.
A structural break in May 2024 marks the market's shift from technology-driven to competition-driven pricing.
Software and architectural innovation, not hardware, primarily drive cost reductions.
Abstract
This paper provides the first systematic economic analysis of token pricing in the large language model (LLM) inference market. Assembling a novel dataset integrating OpenRouter API data (318 models), Epoch AI records (3,237 models), and 62 cross-validated milestone observations spanning 2020-2026, we document an approximately 600-fold decline in token prices and propose the "Tiered Super-Moore" hypothesis. Economy-tier models exhibit a price half-life of 1.10 years and mid-tier models 1.55 years -- both significantly faster than Moore's Law's two-year benchmark -- while flagship models display near-zero exponential fit (R^2 = 0.031) due to a reasoning premium averaging 31.5 times non-reasoning prices. A Chow structural break test identifies May 2024 as the critical market inflection point (F = 5.74, p = 0.005), marking a transition from technology-driven to competition-driven price…
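To make the half-life figures in the abstract easier to compare, here is a minimal Python sketch that converts a price half-life into an annual percentage decline, assuming simple exponential decay. The function names are illustrative and not taken from the paper. The 6-year span used for the 600-fold figure is an assumption read off the "2020-2026" range, so the implied aggregate half-life is only a rough cross-check, not a result reported by the author.

```python
import math


def annual_price_factor(half_life_years: float) -> float:
    """Fraction of the price remaining after one year under exponential decay."""
    return 0.5 ** (1.0 / half_life_years)


def implied_half_life(fold_decline: float, years: float) -> float:
    """Average half-life implied by a total fold decline over a span of years."""
    return years / math.log2(fold_decline)


if __name__ == "__main__":
    # Half-lives quoted in the abstract, plus the two-year Moore's Law benchmark.
    tiers = [("economy tier", 1.10), ("mid tier", 1.55), ("Moore's Law benchmark", 2.0)]
    for name, half_life in tiers:
        factor = annual_price_factor(half_life)
        print(f"{name}: half-life {half_life:.2f} y, "
              f"price falls about {100 * (1 - factor):.1f}% per year")

    # ASSUMPTION: the 600-fold decline is spread over 6 years (2020 to 2026).
    span_years = 6.0
    print(f"aggregate: implied half-life {implied_half_life(600, span_years):.2f} y")
```

The aggregate number tracks the cheapest available price across successive models, so it need not match the per-tier fits, which follow prices within a tier.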
