Say Anything but This: When Tokenizer Betrays Reasoning in LLMs

Navid Ayoobi; Marcus I Armstrong; Arjun Mukherjee

arXiv:2601.14658 · cs.CL · January 22, 2026

TL;DR

This paper shows that subword tokenizers can cause LLM reasoning failures: because several token ID sequences can represent the same surface string, models produce phantom edits and other systematic artifacts that undermine reasoning accuracy.

Contribution

The study introduces a tokenization-consistency probe and a taxonomy of eight systematic tokenizer artifacts that affect LLM reasoning.

Findings

01. Over 11,000 replacement trials analyzed across state-of-the-art LLMs.

02. Non-trivial rate of phantom edits caused by tokenizer artifacts.

03. Identification of systematic tokenizer artifacts like whitespace shifts and intra-word resegmentation.
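
A replacement trial of the kind counted above can be scored at the surface level with a few lines of Python. The sketch below is only an illustration, not the paper's evaluation code: the function names and the three labels (`exact`, `whitespace_shift`, `other_mismatch`) are hypothetical and do not correspond to the paper's eight-category taxonomy. It assumes whole-word replacement and treats any difference that disappears once whitespace is stripped as a whitespace shift.

```python
import re


def expected_output(source, target, replacement):
    """Replace whole-word occurrences of target and leave everything else untouched."""
    pattern = r"\b" + re.escape(target) + r"\b"
    # A function replacement avoids backslash interpretation in `replacement`.
    return re.sub(pattern, lambda match: replacement, source)


def classify(source, output, target, replacement):
    """Assign a hypothetical label to a model output for one replacement trial."""
    want = expected_output(source, target, replacement)
    if output == want:
        return "exact"
    # Same characters once all whitespace is removed: only spacing changed.
    if "".join(output.split()) == "".join(want.split()):
        return "whitespace_shift"
    return "other_mismatch"


print(classify("the cat sat", "the dog sat", "cat", "dog"))
print(classify("the cat sat", "the  dog sat", "cat", "dog"))
print(classify("the cat sat", "the dog sit", "cat", "dog"))
```

A string-level check like this can only flag that something other than the target word changed; attributing the change to a specific tokenizer artifact would require inspecting the token IDs as well.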

Abstract

Large language models (LLMs) reason over discrete token ID sequences, yet modern subword tokenizers routinely produce non-unique encodings: multiple token ID sequences can detokenize to identical surface strings. This representational mismatch creates an unmeasured fragility wherein reasoning processes can fail. LLMs may treat two internal representations as distinct "words" even when they are semantically identical at the text level. In this work, we show that tokenization can betray LLM reasoning through one-to-many token ID mappings. We introduce a tokenization-consistency probe that requires models to replace designated target words in context while leaving all other content unchanged. The task is intentionally simple at the surface level, enabling us to attribute failures to tokenizer-detokenizer artifacts rather than to knowledge gaps or parameter limitations. Through analysis of…
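
The non-unique encodings described in the abstract can be illustrated with a toy example. The sketch below uses a made-up six-entry vocabulary (an assumption for illustration, not taken from any real tokenizer or from the paper) and enumerates every token ID sequence that detokenizes to the same surface string.

```python
# Hypothetical toy vocabulary: token ID -> surface piece.
VOCAB = {0: "un", 1: "related", 2: "unrelated", 3: "rel", 4: "ated", 5: " "}


def decode(ids):
    """Detokenize by concatenating the surface piece of each token ID."""
    return "".join(VOCAB[i] for i in ids)


def encodings(text):
    """Return every token ID sequence whose decoding equals text."""
    if text == "":
        return [[]]
    found = []
    for token_id, piece in VOCAB.items():
        if text.startswith(piece):
            # Consume this piece, then enumerate encodings of the remainder.
            for rest in encodings(text[len(piece):]):
                found.append([token_id] + rest)
    return found


for ids in encodings("unrelated"):
    print(ids, "->", repr(decode(ids)))
```

With this vocabulary the string "unrelated" has three distinct ID sequences, `[0, 1]`, `[0, 3, 4]`, and `[2]`, all of which decode to the same text. A model that operates on the IDs sees three different inputs where a reader sees one word.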

Taxonomy

Topics: Artificial Intelligence in Healthcare and Education · Topic Modeling · Text Readability and Simplification