Pricing Online LLM Services with Data-Calibrated Stackelberg Routing Game
Zhendong Guo, Wenchao Bai, Jiahui Jin

TL;DR
This paper introduces PriLLM, a scalable, real-time pricing solution for LLM services modeled as a Stackelberg game, balancing profit maximization and computational efficiency.
Contribution
It presents a novel deep learning-based approach for dynamic LLM pricing that captures market dynamics and user preferences while ensuring scalability and interpretability.
Findings
Achieves over 95% of optimal profit in experiments.
Requires less than 5% of the computation time of optimal solutions.
Effectively models real-world market behaviors and user preferences.
Abstract
The proliferation of Large Language Models (LLMs) has established LLM routing as a standard service delivery mechanism, where users select models based on cost, Quality of Service (QoS), and other criteria. However, optimal pricing in LLM routing platforms requires precise modeling of dynamic service markets, and solving this problem in real time at scale is computationally intractable. In this paper, we propose PriLLM, a novel, practical, and scalable solution for real-time dynamic pricing in competitive LLM routing. PriLLM models the service market as a Stackelberg game, where providers set prices and users select services based on multiple criteria. To capture real-world market dynamics, we incorporate both objective factors (e.g., cost, QoS) and subjective user preferences into the model. For scalability, we employ a deep aggregation network to learn provider abstraction that…
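The leader-follower structure described in the abstract can be illustrated with a toy sketch. This is not the paper's method: the logit choice rule, the linear utility, the grid search, and every name and number below (`choice_shares`, `leader_profit`, `best_leader_price`, the weights, costs, and QoS scores) are assumptions made only for illustration. One provider (the leader) picks a price, and users (the followers) then split across providers according to price and QoS.

```python
import math

def choice_shares(prices, qos, price_weight=1.0, qos_weight=1.0):
    """Follower stage (assumed logit model): market share of each provider."""
    utils = [qos_weight * q - price_weight * p for p, q in zip(prices, qos)]
    top = max(utils)  # subtract the max for numerical stability
    exps = [math.exp(u - top) for u in utils]
    total = sum(exps)
    return [e / total for e in exps]

def leader_profit(price, cost, rival_prices, qos):
    """Leader's profit per unit of demand; the leader is provider index 0."""
    shares = choice_shares([price] + list(rival_prices), qos)
    return (price - cost) * shares[0]

def best_leader_price(cost, rival_prices, qos, grid):
    """Leader stage: brute-force search over candidate prices."""
    return max(grid, key=lambda p: leader_profit(p, cost, rival_prices, qos))

# Hypothetical market: the leader plus two rivals with fixed prices.
grid = [i / 100 for i in range(0, 501)]
rival_prices = [1.0, 1.5]
qos = [1.2, 1.0, 1.4]  # leader first, then rivals
best = best_leader_price(0.5, rival_prices, qos, grid)
print("leader price:", best)
print("leader profit:", leader_profit(best, 0.5, rival_prices, qos))
```

The brute-force search anticipates the users' response at every candidate price, which is the defining feature of a Stackelberg leader. It is also the kind of computation that stops scaling as the number of providers and users grows, which is the cost the abstract says PriLLM is designed to avoid.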
Taxonomy
Topics: Caching and Content Delivery · Natural Language Processing Techniques · Recommender Systems and Techniques
