When Is the Same Model Not the Same Service? A Measurement Study of Hosted Open-Weight LLM APIs

Haorui Li, Zhenghui He, Xuanzi Liu, Yang Xu, Dongsheng Liu, Jiakang Ma, Lupan Wu, Yangjie Wu, Xiongchao Tang, and Tianhui Shi

arXiv:2605.02821·cs.PF·May 8, 2026

TL;DR

This study treats hosted open-weight LLM APIs as dynamic, provider-specific services rather than static models, and measures how demand concentrates across model families, how provider listings relate to actual adoption, and how task type shapes provider choice.

Contribution

The paper introduces a measurement methodology and an empirical analysis showing that hosted LLM APIs are heterogeneous, evolving services shaped by provider-specific and task-specific factors.

Findings

01. Demand is concentrated but persistent across versions.

02. Provider listing breadth does not guarantee adoption.

03. Task type influences provider choice and performance.

Abstract

Open-weight large language models (LLMs) are usually named as model artifacts, but production users often consume them as hosted API services. This paper argues that the operational unit is a service object: a provider-specific, time-varying endpoint defined by model variant, protocol behavior, context capacity, listed price, latency and throughput distribution, reliability, and task feasibility. Using sampled request logs, provider metadata, compatibility probes, pricing snapshots, and continuous latency measurements collected by AI Ping during Q4 2025, we study how this service layer changes the meaning of "the same model." Three empirical patterns emerge. First, observed demand is concentrated but persistent across versions: in the displayed family aggregate, the largest family carries 32.0% of relative demand and the top five carry 87.4%, with a Gini coefficient of 0.693, while…
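The concentration statistics quoted in the abstract (largest-family share, top-five share, Gini coefficient) can be illustrated with a minimal sketch. The function names `gini` and `top_k_share` and the demand values below are hypothetical, chosen only for illustration; the excerpt does not state how the authors computed their figures, so this uses the standard rank-weighted Gini formula as an assumption.

```python
from typing import Sequence


def gini(values: Sequence[float]) -> float:
    """Gini coefficient of non-negative values (0 means perfectly equal)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total <= 0:
        raise ValueError("need at least one positive value")
    # Rank-weighted form with ascending ranks i = 1..n:
    # G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n


def top_k_share(values: Sequence[float], k: int) -> float:
    """Fraction of the total carried by the k largest values."""
    total = sum(values)
    if total <= 0:
        raise ValueError("total must be positive")
    return sum(sorted(values, reverse=True)[:k]) / total


if __name__ == "__main__":
    # Hypothetical relative demand per model family (not data from the paper).
    demand = [40.0, 25.0, 15.0, 10.0, 5.0, 3.0, 2.0]
    print(f"largest-family share: {top_k_share(demand, 1):.3f}")
    print(f"top-five share:       {top_k_share(demand, 5):.3f}")
    print(f"Gini coefficient:     {gini(demand):.3f}")
```

With this formula, perfectly equal demand gives a Gini of 0, and demand held entirely by one of n families gives (n - 1) / n, so the value depends on how many families are included in the aggregate.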

Code & Models

Repositories

haoruilee/llm_api_measurement_study
github
