Weight-based Analysis of Detokenization in Language Models: Understanding the First Stage of Inference Without Inference

Go Kamoda; Benjamin Heinzerling; Tatsuro Inaba; Keito Kudo; Keisuke Sakaguchi; Kentaro Inui

arXiv:2501.15754·cs.CL·February 11, 2025

TL;DR

This paper presents a weight-based analytical approach to understand the detokenization process in GPT-2, revealing how model weights influence attention biases without performing inference.

Contribution

It introduces a novel decomposition of first-layer attention in GPT-2 that explains detokenization effects purely through weight analysis, bypassing inference-based methods.

Findings

01. Weight-based explanations reveal attention bias toward close tokens.

02. Decomposition quantifies contributions of position, token, and mixed effects.

03. Analysis enhances understanding of early-stage language model processing.

Abstract

According to the stages-of-inference hypothesis, early layers of language models map their subword-tokenized input, which does not necessarily correspond to a linguistically meaningful segmentation, to more meaningful representations that form the model's "inner vocabulary". Prior analysis of this detokenization stage has predominantly relied on probing and interventions such as path patching, which involve selecting particular inputs, choosing a subset of components that will be patched, and then observing changes in model behavior. Here, we show that several important aspects of the detokenization stage can be understood purely by analyzing model weights, without performing any model inference steps. Specifically, we introduce an analytical decomposition of first-layer attention in GPT-2. Our decomposition yields interpretable terms that quantify the relative contributions of…
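To illustrate the kind of weight-only decomposition the abstract describes, here is a minimal sketch and not the paper's exact formulation. It assumes a simplified first layer in which each input is the sum of a token embedding and a position embedding, and it ignores layer normalization and the query/key bias terms that real GPT-2 includes. Under those assumptions the pre-softmax attention logit is bilinear in the inputs, so it splits exactly into token-token, token-position, position-token, and position-position terms. All names (`decompose_logits`, `E`, `P`, `W_Q`, `W_K`) are hypothetical, and random matrices stand in for trained weights.

```python
import numpy as np

def decompose_logits(E, P, W_Q, W_K, tokens):
    """Split pre-softmax attention logits into four bilinear terms.

    Simplifying assumptions (not taken from the paper): no layer norm,
    no query/key biases, a single attention head.
    """
    n = len(tokens)
    e = E[tokens]                       # token embeddings, shape (n, d_model)
    p = P[:n]                           # position embeddings, shape (n, d_model)
    d_head = W_Q.shape[1]
    A = W_Q @ W_K.T / np.sqrt(d_head)   # combined query-key matrix, (d_model, d_model)
    return {
        "token-token": e @ A @ e.T,
        "token-position": e @ A @ p.T,  # query token, key position
        "position-token": p @ A @ e.T,  # query position, key token
        "position-position": p @ A @ p.T,
    }

# Random stand-ins for trained weights (hypothetical sizes).
rng = np.random.default_rng(0)
vocab, max_pos, d_model, d_head = 50, 16, 32, 8
E = rng.normal(size=(vocab, d_model))
P = rng.normal(size=(max_pos, d_model))
W_Q = rng.normal(size=(d_model, d_head))
W_K = rng.normal(size=(d_model, d_head))
tokens = np.array([3, 17, 42, 8])

terms = decompose_logits(E, P, W_Q, W_K, tokens)

# Reference: logits computed the usual way from summed embeddings.
x = E[tokens] + P[: len(tokens)]
full = (x @ W_Q) @ (x @ W_K).T / np.sqrt(d_head)
recombined = sum(terms.values())

print(np.allclose(full, recombined))
```

The position-position term depends only on the weights and the two positions, so in this simplified setting it can be inspected for a distance-dependent bias without choosing any input text. The causal mask and softmax are omitted here; they apply to the summed logits afterwards.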

Taxonomy

Topics: Natural Language Processing Techniques

Methods: Attention Is All You Need · Cosine Annealing · Softmax · Adam · Residual Connection · Dropout · Byte Pair Encoding · Linear Layer · Attention Dropout