Token Erasure as a Footprint of Implicit Vocabulary Items in LLMs
Sheridan Feucht, David Atkinson, Byron Wallace, David Bau

TL;DR
This paper investigates how large language models process multi-token words and named entities, revealing an erasure effect in token representations and proposing a method to uncover the implicit vocabulary within these models.
Contribution
The paper introduces a method for probing the implicit vocabulary of LLMs by analyzing how token representations change across layers, and it identifies the erasure phenomenon as a signal for finding such vocabulary items.
Findings
Last-token representations of multi-token words and named entities show rapid erasure of token-level information in early layers.
The proposed method uses this erasure effect to reveal implicit vocabulary items.
The empirical analysis covers the Llama-2-7b and Llama-3-8B models.
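To make the first finding concrete, here is a minimal sketch of how one might locate where a probe's accuracy falls off across layers. It is not the paper's erasure score: the function name `largest_drop` is hypothetical, and the accuracy values are made-up placeholders chosen for illustration, not results from the paper.

```python
def largest_drop(acc_by_layer):
    """Return (layer_index, drop) for the biggest decrease between consecutive layers.

    acc_by_layer[k] is assumed to be a probe's accuracy at recovering an
    earlier token from the last-token hidden state at layer k.
    """
    drops = [acc_by_layer[k] - acc_by_layer[k + 1]
             for k in range(len(acc_by_layer) - 1)]
    k = max(range(len(drops)), key=drops.__getitem__)
    return k + 1, drops[k]  # the drop happens going INTO layer k + 1


# Placeholder values only (NOT measurements from the paper).
toy_accuracy = [0.875, 0.75, 0.25, 0.25, 0.125]
layer, drop = largest_drop(toy_accuracy)
print(layer, drop)
```

A sharp early drop in such a curve is the kind of pattern the finding describes; real probe accuracies would have to come from trained probes on actual hidden states.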
Abstract
LLMs process text as sequences of tokens that roughly correspond to words, where less common words are represented by multiple tokens. However, individual tokens are often semantically unrelated to the meanings of the words/concepts they comprise. For example, Llama-2-7b's tokenizer splits the word "northeastern" into the tokens ['_n', 'ort', 'he', 'astern'], none of which correspond to semantically meaningful units like "north" or "east." Similarly, the overall meanings of named entities like "Neil Young" and multi-word expressions like "break a leg" cannot be directly inferred from their constituent tokens. Mechanistically, how do LLMs convert such arbitrary groups of tokens into useful higher-level representations? In this work, we find that last token representations of named entities and multi-token words exhibit a pronounced "erasure" effect, where information about previous and…
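The tokenization point in the abstract can be illustrated with a toy sketch. The code below is a simple longest-match tokenizer, not Llama-2-7b's actual algorithm, and `TOY_VOCAB` is a hypothetical mini-vocabulary chosen so that the split reproduces the example from the abstract. It only shows that none of the resulting pieces line up with "north" or "east".

```python
def greedy_tokenize(word, vocab):
    """Toy longest-match tokenizer (an assumption; NOT Llama's real tokenizer)."""
    text = "_" + word  # "_" marks the word boundary, as in the abstract's example
    tokens, i = [], 0
    while i < len(text):
        # Take the longest vocabulary entry that matches at position i.
        for j in range(len(text), i, -1):
            if text[i:j] in vocab:
                tokens.append(text[i:j])
                i = j
                break
        else:
            raise ValueError(f"no vocabulary entry matches at position {i}")
    return tokens


# Hypothetical vocabulary, picked to reproduce the split quoted in the abstract.
TOY_VOCAB = {"_n", "ort", "he", "astern"}

pieces = greedy_tokenize("northeastern", TOY_VOCAB)
print(pieces)

# None of the pieces corresponds to a meaningful unit such as "north" or "east".
meaningful = [p for p in pieces if p.strip("_") in {"north", "east"}]
print(meaningful)
```

With a real model, the split would come from the model's own tokenizer rather than a hand-built vocabulary.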
Taxonomy
Topics: Natural Language Processing Techniques · Artificial Intelligence in Law · Library Science and Information Systems
