Unraveling Token Prediction Refinement and Identifying Essential Layers in Language Models

Jaturong Kongmanee

arXiv:2501.15054·cs.CL·June 10, 2025

TL;DR

This paper investigates how large language models refine token predictions across their internal layers. It shows that the position of relevant information in the input affects how many layers a prediction needs to stabilize, and it identifies which layers are crucial for accurate outputs.

Contribution

It applies a logit lens analysis to intermediate representations to trace how token predictions are refined, identifies the layers essential to the final prediction, and highlights how the position of relevant information in the input affects model behavior.
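The logit lens idea can be illustrated with a minimal sketch: each layer's hidden state is projected onto the vocabulary with the model's output (unembedding) matrix, and the top-scoring token is read off per layer. The function names, the toy layer normalization, and the random arrays below are illustrative assumptions, not the paper's code or any library's API.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # Toy normalization standing in for the model's final norm (assumption).
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def logit_lens(hidden_states, unembed, final_norm=None):
    """Return the top-1 token id predicted from each layer's hidden state.

    hidden_states: array of shape (num_layers, d_model) for one token position.
    unembed:       array of shape (d_model, vocab_size).
    final_norm:    optional callable applied before the projection.
    """
    h = hidden_states if final_norm is None else final_norm(hidden_states)
    logits = h @ unembed              # shape (num_layers, vocab_size)
    return logits.argmax(axis=-1)     # shape (num_layers,)

# Hypothetical stand-ins for real model activations and weights.
rng = np.random.default_rng(0)
hidden = rng.normal(size=(6, 16))     # 6 layers, d_model = 16
W_U = rng.normal(size=(16, 50))       # vocabulary of 50 tokens
per_layer_top1 = logit_lens(hidden, W_U, final_norm=layer_norm)
```

With a real model, `hidden` would hold the residual-stream activations at the answer position, and `W_U` the model's own output projection.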

Findings

01

Prediction refinement depth varies with the position of relevant information in the input, following an approximately inverted U-shaped curve.

02

When relevant information sits at the beginning or end of the input, the model needs fewer refinement layers.

03

Not all layers are equally important for final token prediction accuracy.

Abstract

This research aims to unravel how large language models (LLMs) iteratively refine token predictions through internal processing. We utilized a logit lens technique to analyze the model's token predictions derived from intermediate representations. Specifically, we focused on (1) how LLMs access and utilize information from input contexts, and (2) how positioning of relevant information affects the model's token prediction refinement process. On a multi-document question answering task with varying input context lengths, we found that the depth of prediction refinement (defined as the number of intermediate layers an LLM uses to transition from an initial correct token prediction to its final, stable correct output), as a function of the position of relevant information, exhibits an approximately inverted U-shaped curve. We also found that the gap between these two layers, on average,…
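The refinement-depth definition above can be made concrete with a short sketch. It takes the per-layer top-1 predictions from a logit lens pass and measures the gap between the first layer that predicts the correct token and the layer from which that prediction stays correct through the final layer. This is one reading of the definition in the abstract; the function name and the handling of the case where the final prediction is wrong are assumptions.

```python
from typing import Optional, Sequence, Tuple

def refinement_depth(
    per_layer_top1: Sequence[int], correct_token: int
) -> Optional[Tuple[int, int, int]]:
    """Return (first_correct_layer, stable_layer, depth), or None.

    first_correct_layer: earliest layer whose top-1 token is correct.
    stable_layer:        earliest layer from which the top-1 token stays
                         correct through the last layer.
    depth:               stable_layer - first_correct_layer.
    None is returned when the final layer's prediction is not correct
    (an assumption: such examples have no stable correct output).
    """
    n = len(per_layer_top1)
    if n == 0 or per_layer_top1[-1] != correct_token:
        return None
    first = next(i for i, t in enumerate(per_layer_top1) if t == correct_token)
    stable = n - 1
    # Walk backwards while the preceding layer is also correct.
    while stable > 0 and per_layer_top1[stable - 1] == correct_token:
        stable -= 1
    return first, stable, stable - first

# Token 7 first appears at layer 1, is lost at layer 3, and is stable from layer 4.
example = refinement_depth([3, 7, 7, 2, 7, 7], correct_token=7)
```

A depth of 0 means the first correct prediction was already the stable one; larger values mean the model reached the correct token early but revised it before settling.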

Taxonomy

Topics: Computational and Text Analysis Methods · Topic Modeling

Methods: Attention Is All You Need · Cosine Annealing · Softmax · Adam · Residual Connection · Dropout · Byte Pair Encoding · Linear Layer · Attention Dropout