Layer-wise Positional Bias in Short-Context Language Modeling

Maryam Rahimi; Mahdi Nouri; Yadollah Yaghoobzadeh

arXiv:2601.04098·cs.CL·January 8, 2026

Open Access

TL;DR

This paper introduces an attribution-based framework for analyzing how positional bias evolves across layers in short-context language modeling. It finds stable, architecture-specific positional importance profiles, with recency bias growing and primacy bias shrinking as depth increases.

Contribution

The paper presents a layer-wise method for analyzing positional bias in language models, based on layer conductance with a sliding window. It uses this method to uncover stable positional importance profiles and to relate them to model depth and architecture.

Findings

01. Recency bias increases with model depth.

02. Primacy bias diminishes with model depth.

03. Early layers favor content words over function words.

Abstract

Language models often show a preference for using information from specific positions in the input regardless of semantic relevance. While positional bias has been studied in various contexts, from attention sinks to task performance degradation in long-context settings, prior work has not established how these biases evolve across individual layers and input positions, or how they vary independent of task complexity. We introduce an attribution-based framework to analyze positional effects in short-context language modeling. Using layer conductance with a sliding-window approach, we quantify how each layer distributes importance across input positions, yielding layer-wise positional importance profiles. We find that these profiles are architecture-specific, stable across inputs, and invariant to lexical scrambling. Characterizing these profiles, we find prominent recency bias that…
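
As a rough illustration of what a positional importance profile could look like numerically, the sketch below averages per-position attribution scores over several windows and normalizes them into a distribution. It does not compute layer conductance and is not the authors' implementation: the function names `positional_profile` and `recency_primacy`, the use of absolute values, the sum-to-one normalization, and the toy scores are all assumptions made for this example.

```python
from statistics import mean


def positional_profile(window_scores):
    """Average absolute attribution per position across windows.

    window_scores: list of equal-length lists, one per sliding window,
    holding a hypothetical attribution score for each input position.
    Returns a list that sums to 1 (assumed normalization).
    """
    if not window_scores:
        raise ValueError("need at least one window")
    width = len(window_scores[0])
    if any(len(w) != width for w in window_scores):
        raise ValueError("all windows must have the same length")
    # Mean absolute score at each position, taken over all windows.
    raw = [mean(abs(w[i]) for w in window_scores) for i in range(width)]
    total = sum(raw)
    if total == 0:
        # No signal anywhere: fall back to a uniform profile.
        return [1.0 / width] * width
    return [r / total for r in raw]


def recency_primacy(profile, k=1):
    """Share of importance on the first k and last k positions."""
    return {"primacy": sum(profile[:k]), "recency": sum(profile[-k:])}


# Made-up scores for one layer: 3 windows of 4 positions each.
toy = [
    [0.1, 0.1, 0.2, 0.6],
    [0.2, 0.1, 0.1, 0.6],
    [0.3, 0.1, 0.0, 0.6],
]
profile = positional_profile(toy)
bias = recency_primacy(profile)
```

In this toy case the last position carries most of the normalized importance, so the recency share exceeds the primacy share. Repeating the computation with scores from each layer would give one profile per layer, which is the kind of layer-wise comparison the abstract describes.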

Taxonomy

Topics: Topic Modeling · Neurobiology of Language and Bilingualism · Multimodal Machine Learning Applications