Is There a Case for Conversation Optimized Tokenizers in Large Language Models?

Raquel Ferrando, Javier Conde, Gonzalo Martínez, Pedro Reviriego

arXiv:2506.18674 · cs.CL · June 24, 2025

TL;DR

This paper investigates whether tokenizers optimized for chatbot conversations can improve efficiency. It finds that conversation-optimized tokenizers reduce token counts in dialogues by 5-10%, potentially saving energy, without harming tokenization efficiency on the original training corpus.

Contribution

It introduces conversation-specific tokenizer optimization for chatbots and demonstrates its benefits in reducing token counts and energy consumption.

Findings

1. Conversation-optimized tokenizers reduce the number of tokens in dialogues by 5-10%.

2. Fewer tokens per conversation could translate into energy savings.

3. The impact on tokenization efficiency for the original training corpus is minimal or even positive.
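The reduction reported in the findings is a relative token count. As a minimal sketch of that arithmetic, the snippet below compares the number of tokens two tokenizers produce for the same conversations. The function name `token_reduction_pct` and the token counts are hypothetical, chosen only for illustration; they are not figures or code from the paper.

```python
def token_reduction_pct(baseline_count: int, optimized_count: int) -> float:
    """Return the relative reduction in token count as a percentage of the baseline."""
    if baseline_count <= 0:
        raise ValueError("baseline_count must be positive")
    return 100.0 * (baseline_count - optimized_count) / baseline_count


# Hypothetical token counts for the same set of conversations (illustrative only):
# one from a general-purpose tokenizer, one from a conversation-optimized tokenizer.
baseline_tokens = 1_000_000
optimized_tokens = 930_000

reduction = token_reduction_pct(baseline_tokens, optimized_tokens)
print(f"Token reduction: {reduction:.1f}%")
```

A negative result would mean the second tokenizer produces more tokens than the baseline, which is the case to watch for when checking the effect on the original training corpus.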

Abstract

The computational and energy costs of Large Language Models (LLMs) have increased exponentially, driven by growing model sizes and the massive adoption of LLMs by hundreds of millions of users. The unit cost of an LLM is the computation of a token. Therefore, the tokenizer plays an important role in the efficiency of a model, and tokenizers are carefully optimized to minimize the number of tokens for the text in their training corpus. One of the most popular applications of LLMs is chatbots that interact with users. A key observation is that, for those chatbots, what matters is the performance of the tokenizer on the user text input and the chatbot responses. Those are most likely different from the text in the training corpus. So, a question that immediately arises is whether there is a potential benefit in optimizing tokenizers for chatbot conversations. In this paper, this idea is…
