Unraveling Token Prediction Refinement and Identifying Essential Layers in Language Models
Jaturong Kongmanee

TL;DR
This paper investigates how large language models refine token predictions internally, revealing the influence of input position on prediction refinement depth and identifying which layers are crucial for accurate outputs.
Contribution
It applies a logit lens analysis to intermediate representations to study how token predictions are refined, identifies which layers are essential for the final prediction, and highlights how the position of relevant information in the input affects model behavior.
Findings
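Prediction refinement depth varies with the position of relevant information in the input, following an approximately inverted U-shaped curve.
When relevant information sits near the beginning or end of the input, the model needs fewer refinement layers to reach its final prediction.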
Not all layers are equally important for final token prediction accuracy.
Abstract
This research aims to unravel how large language models (LLMs) iteratively refine token predictions through internal processing. We utilized a logit lens technique to analyze the model's token predictions derived from intermediate representations. Specifically, we focused on (1) how LLMs access and utilize information from input contexts, and (2) how positioning of relevant information affects the model's token prediction refinement process. On a multi-document question answering task with varying input context lengths, we found that the depth of prediction refinement (defined as the number of intermediate layers an LLM uses to transition from an initial correct token prediction to its final, stable correct output), as a function of the position of relevant information, exhibits an approximately inverted U-shaped curve. We also found that the gap between these two layers, on average,…
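The two ideas in the abstract, reading a token prediction from each intermediate layer and measuring prediction refinement depth, can be illustrated with a minimal sketch. It is not the paper's implementation. The names `logit_lens_predictions` and `refinement_depth`, the toy hidden states, and the identity unembedding matrix are all hypothetical. The sketch also assumes one particular reading of the definition: depth is the number of layers between the first layer that predicts the correct token and the earliest layer from which the prediction stays correct through the final layer. A logit lens on a real model usually applies the final layer normalization before the unembedding, which is omitted here.

```python
import numpy as np


def logit_lens_predictions(hidden_states, unembed):
    """Project each layer's hidden state onto the vocabulary and take the argmax.

    hidden_states: array of shape (num_layers, d_model), one vector per layer
                   for a single token position (toy data in this sketch).
    unembed:       array of shape (d_model, vocab_size), a stand-in for the
                   model's output embedding matrix (assumption).
    """
    logits = hidden_states @ unembed      # shape (num_layers, vocab_size)
    return logits.argmax(axis=-1)         # predicted token id at each layer


def refinement_depth(layer_preds, target):
    """Layers between the first correct prediction and the stable correct one.

    Returns None when the final layer is wrong, since no stable correct
    output exists in that case (an assumption of this sketch).
    """
    correct = [int(p) == target for p in layer_preds]
    if not correct or not correct[-1]:
        return None
    first = correct.index(True)           # first layer that predicts the target
    stable = len(correct) - 1             # walk back to the start of the final correct run
    while stable > 0 and correct[stable - 1]:
        stable -= 1
    return stable - first


# Toy example: 5 layers, 3-dimensional hidden states, 3-token vocabulary.
hidden = np.array([
    [1.0, 0.0, 0.0],   # layer 0 predicts token 0
    [0.0, 1.0, 0.0],   # layer 1 predicts token 1 (first correct)
    [1.0, 0.0, 0.0],   # layer 2 reverts to token 0
    [0.0, 1.0, 0.0],   # layer 3 predicts token 1 (stable from here)
    [0.0, 1.0, 0.0],   # layer 4 predicts token 1
])
preds = logit_lens_predictions(hidden, np.eye(3))
depth = refinement_depth(preds, target=1)
print(preds.tolist(), depth)   # [0, 1, 0, 1, 1] 2
```

In this toy case the correct token first appears at layer 1 but only becomes stable at layer 3, so the depth is 2. A prediction that is correct from its first appearance onward has depth 0.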
Taxonomy
Topics: Computational and Text Analysis Methods · Topic Modeling
Methods: Attention Is All You Need · Cosine Annealing · Softmax · Adam · Residual Connection · Dropout · Byte Pair Encoding · Linear Layer · Attention Dropout
