Token Erasure as a Footprint of Implicit Vocabulary Items in LLMs

Sheridan Feucht; David Atkinson; Byron Wallace; David Bau

arXiv:2406.20086·cs.CL·October 14, 2024

TL;DR

This paper investigates how large language models process multi-token words and named entities, revealing an erasure effect in token representations and proposing a method to uncover the implicit vocabulary within these models.

Contribution

It introduces a novel method to probe the implicit vocabulary of LLMs by analyzing token representation differences across layers, highlighting the erasure phenomenon.

Findings

01 Last-token representations of named entities and multi-token words show rapid information erasure in early layers.

02 The proposed method successfully reveals implicit vocabulary items.

03 The empirical analysis covers the Llama-2-7b and Llama-3-8B models.

Abstract

LLMs process text as sequences of tokens that roughly correspond to words, where less common words are represented by multiple tokens. However, individual tokens are often semantically unrelated to the meanings of the words/concepts they comprise. For example, Llama-2-7b's tokenizer splits the word "northeastern" into the tokens ['_n', 'ort', 'he', 'astern'], none of which correspond to semantically meaningful units like "north" or "east." Similarly, the overall meanings of named entities like "Neil Young" and multi-word expressions like "break a leg" cannot be directly inferred from their constituent tokens. Mechanistically, how do LLMs convert such arbitrary groups of tokens into useful higher-level representations? In this work, we find that last token representations of named entities and multi-token words exhibit a pronounced "erasure" effect, where information about previous and…
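The "northeastern" example can be checked mechanically. The following is a minimal sketch, not code from the paper: it takes the token pieces quoted in the abstract (with the leading word-boundary marker dropped) and compares their cut points against a hand-written morpheme split. The split `north` + `east` + `ern` and the helper `internal_boundaries` are assumptions introduced only for this illustration.

```python
from itertools import accumulate


def internal_boundaries(pieces):
    """Return the character offsets where one piece ends and the next begins."""
    return set(accumulate(len(p) for p in pieces[:-1]))


# Token pieces as quoted in the abstract, without the leading "_" marker.
tokens = ["n", "ort", "he", "astern"]
# Hand-written morpheme split (an illustrative assumption, not from the paper).
morphemes = ["north", "east", "ern"]

# Both segmentations must spell the same word for the comparison to make sense.
assert "".join(tokens) == "".join(morphemes) == "northeastern"

token_cuts = internal_boundaries(tokens)
morpheme_cuts = internal_boundaries(morphemes)
shared_cuts = token_cuts & morpheme_cuts

print(sorted(token_cuts), sorted(morpheme_cuts), sorted(shared_cuts))
# prints: [1, 4, 6] [5, 9] []
```

Under this assumed morpheme split, the token cut points and the morpheme cut points do not coincide anywhere, which is the sense in which the individual tokens are unrelated to the word's meaningful parts.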

Code & Models

Models

🤗 sfeucht/footprints (model)

Taxonomy

Topics: Natural Language Processing Techniques · Artificial Intelligence in Law · Library Science and Information Systems