Tokenization Multiplicity Leads to Arbitrary Price Variation in LLM-as-a-service
Ivi Chatzi, Nina Corvelo Benz, Stratis Tsirtsis, and Manuel Gomez-Rodriguez

TL;DR
This paper reveals that tokenization multiplicity causes arbitrary price variations in LLM-as-a-service and proposes canonical generation with an efficient sampling algorithm to ensure consistent tokenization and pricing.
Contribution
It introduces canonical generation, a form of constrained generation that restricts LLMs to the canonical tokenization of each output string, addressing the price inconsistency caused by tokenization multiplicity.
Findings
Canonical generation effectively eliminates tokenization multiplicity.
The sampling algorithm is comparable to standard methods in performance and runtime.
The approach ensures consistent pricing for LLM outputs.
Abstract
Providers of LLM-as-a-service have predominantly adopted a simple pricing model: users pay a fixed price per token. Consequently, one may think that two different users would pay the same price for the same output string under the same input prompt. In our work, we show that, surprisingly, this is not (always) true. We find empirical evidence that, particularly for non-English outputs, both proprietary and open-weights LLMs often generate the same (output) string with multiple different tokenizations, even under the same input prompt, and this in turn leads to arbitrary price variation. To address the problem of tokenization multiplicity, we introduce canonical generation, a type of constrained generation that restricts LLMs to only generate canonical tokenizations -- the unique tokenization in which each string is tokenized during the training process of an LLM. Further, we…
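To make the pricing issue concrete, here is a minimal Python sketch of tokenization multiplicity under per-token pricing. Everything in it is an assumption for illustration: the toy vocabulary `VOCAB`, the flat `PRICE_PER_TOKEN`, and the helper names are hypothetical, and the greedy longest-match encoder is only a stand-in for a real tokenizer's encoding procedure. It is not the sampling algorithm proposed in the paper.

```python
from typing import List

# Hypothetical toy vocabulary; real LLM vocabularies are far larger.
VOCAB = {"a", "b", "c", "ab", "bc", "abc"}
# Assumed flat price per output token, in arbitrary integer units.
PRICE_PER_TOKEN = 1


def all_tokenizations(text: str) -> List[List[str]]:
    """Enumerate every way to split `text` into tokens drawn from VOCAB."""
    if not text:
        return [[]]
    results = []
    for end in range(1, len(text) + 1):
        prefix = text[:end]
        if prefix in VOCAB:
            for rest in all_tokenizations(text[end:]):
                results.append([prefix] + rest)
    return results


def canonical_tokenization(text: str) -> List[str]:
    """Toy stand-in for a tokenizer's encoder: greedy longest match.

    Assumption: this only illustrates that an encoder maps each string
    to exactly one tokenization; real tokenizers (e.g. BPE) differ.
    """
    tokens = []
    i = 0
    while i < len(text):
        for end in range(len(text), i, -1):
            if text[i:end] in VOCAB:
                tokens.append(text[i:end])
                i = end
                break
        else:
            raise ValueError(f"cannot tokenize {text[i:]!r} with VOCAB")
    return tokens


def price(tokens: List[str]) -> int:
    """Price of an output under fixed per-token pricing."""
    return len(tokens) * PRICE_PER_TOKEN


if __name__ == "__main__":
    output = "abc"
    canonical = canonical_tokenization(output)
    for toks in all_tokenizations(output):
        tag = "canonical" if toks == canonical else "non-canonical"
        print(toks, "price:", price(toks), tag)
```

In this toy setting the same output string can be billed at several different prices depending on which tokenization the model happened to generate, while restricting generation to the single canonical tokenization yields one price per string.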
Taxonomy
Topics: Topic Modeling · Semantic Web and Ontologies · Advanced Text Analysis Techniques
