Locational Pricing for Generative-AI Services via Token-Flow Market Clearing

Shaohui Liu

arXiv:2605.09047·cs.NI·May 12, 2026

TL;DR

This paper proposes a locational token-flow market model that dispatches generative-AI workloads across geographically distributed compute nodes and communication links while accounting for both cost and latency.

Contribution

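The paper introduces a network-constrained token-flow market model, formulated as a linear program whose dual variables define location- and workload-specific marginal service prices, together with a transfer-aware extension that prices data movement in physical units.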
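As a rough illustration of what a network-constrained clearing problem of this kind can look like, the following is a generic sketch and not the paper's notation. All symbols are assumptions introduced here: x_{n,w} is the volume of workload w processed at node n, f_{nm,w} is the volume routed from node n to node m, d_{n,w} is demand, c_{n,w} is a per-token processing cost, a_w and b_w are per-token compute and bandwidth requirements, and K_n and B_{nm} are node and link capacities.

```latex
\begin{aligned}
\min_{x \ge 0,\; f \ge 0} \quad & \sum_{n}\sum_{w} c_{n,w}\, x_{n,w} \\
\text{s.t.} \quad
& x_{n,w} + \sum_{m} f_{nm,w} - \sum_{m} f_{mn,w} = d_{n,w}
  && \forall n, w \quad (\text{dual } \lambda_{n,w}) \\
& \sum_{w} a_{w}\, x_{n,w} \le K_{n}
  && \forall n \\
& \sum_{w} b_{w}\, f_{nm,w} \le B_{nm}
  && \forall (n,m)
\end{aligned}
```

Under these assumptions, the dual variable \lambda_{n,w} of the balance constraint would play the role of the marginal service price for workload w at node n: it measures how much the optimal cost changes when demand d_{n,w} grows by one unit.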

Findings

01 The transfer-aware model raises operating costs by 2.7% in a 5-node case study.

02 Locational prices can increase by 117% when chatbot latency is reduced from 100 ms to 15 ms.

03 The dispatch logic remains consistent at larger scales, but the model becomes infeasible when demand exceeds capacity.

Abstract

GenAI services are in an early yet fast-expanding phase. Providers compete on model capability and service quality, while the underlying infrastructure remains expensive and heterogeneous across regions, workloads, and compute assets. If these services diffuse into routine daily use, the relevant engineering problem becomes not only better models but also efficient dispatch on a geographically distributed AI service infrastructure. To address this, we formulate a network-constrained token-flow market that clears AI workloads across compute nodes and communication links. The baseline model is a linear program that co-optimizes routing and processing subject to compute-capacity and bandwidth constraints; its dual variables define location- and workload-specific marginal service prices. We further introduce a transfer-aware extension that prices data movement in physical units and isolates…
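To make the idea of locational marginal prices concrete, here is a minimal two-node sketch in plain Python. It is not the paper's model or case study: the system, the numbers, and the function names `dispatch` and `marginal_prices` are hypothetical. It assumes node A is the cheaper node, that only B's demand can be routed to A over a capacity-limited link, and that transfers are free, so the least-cost dispatch has a closed form. The price at each node is taken as the extra cost of serving one more token there, which matches the LP dual away from capacity kinks.

```python
def dispatch(demand_a, demand_b, cap_a, cap_b, cost_a, cost_b, link_cap):
    """Least-cost dispatch for a hypothetical two-node system.

    Assumes cost_a <= cost_b and that only B's demand can be routed to A.
    Returns (total_cost, routed), where routed is B's demand served at A.
    """
    if cost_a > cost_b:
        raise ValueError("sketch assumes node A is the cheaper node")
    spare_a = cap_a - demand_a
    if spare_a < 0:
        raise ValueError("infeasible: demand at A exceeds A's capacity")
    # Route as much of B's demand to the cheap node as link and capacity allow.
    routed = min(demand_b, link_cap, spare_a)
    local_b = demand_b - routed
    if local_b > cap_b:
        raise ValueError("infeasible: demand at B exceeds reachable capacity")
    total_cost = cost_a * (demand_a + routed) + cost_b * local_b
    return total_cost, routed


def marginal_prices(demand_a, demand_b, **system):
    """Price at each node = cost increase from one extra token of demand there."""
    base, _ = dispatch(demand_a, demand_b, **system)
    price_a = dispatch(demand_a + 1, demand_b, **system)[0] - base
    price_b = dispatch(demand_a, demand_b + 1, **system)[0] - base
    return price_a, price_b


# Hypothetical system: A costs 1.0 per token, B costs 3.0 per token.
system = dict(cap_a=100, cap_b=100, cost_a=1.0, cost_b=3.0, link_cap=30)
congested = marginal_prices(40, 50, **system)
uncongested = marginal_prices(40, 50, **{**system, "link_cap": 80})
```

With the 30-token link, the link binds and the prices separate: `congested` is `(1.0, 3.0)`. Widening the link to 80 tokens removes the congestion and both nodes clear at A's cost, so `uncongested` is `(1.0, 1.0)`. The function raises `ValueError` when demand cannot be served with the available capacity, which is the infeasible case.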
