Attention Basin: Why Contextual Position Matters in Large Language Models

Zihao Yi; Delong Zeng; Zhenqing Ling; Haohao Luo; Zhe Xu; Wei Liu; Jian Luan; Wanxia Cao; Ying Shen

arXiv:2508.05128 · cs.CL · August 8, 2025

TL;DR

This paper identifies the attention basin phenomenon in large language models, in which models attend more to items at the beginning and end of a structured input sequence than to those in the middle, and introduces AttnRank, a training-free method that improves performance by reordering inputs to match these attention preferences.

Contribution

The paper identifies the attention basin phenomenon and proposes AttnRank, a model-agnostic, training-free reordering method to enhance LLM performance by aligning salient information with high-attention positions.
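The reordering idea can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `attn_rank`, the per-position attention profile, and the relevance scores are all hypothetical inputs assumed for illustration. The sketch assumes the attention profile has already been estimated (for example, from a small calibration set) and that each item already has a relevance score; it then places the most relevant items in the positions that receive the most attention.

```python
from typing import List, Sequence


def attn_rank(
    items: Sequence[str],
    relevance: Sequence[float],
    position_attention: Sequence[float],
) -> List[str]:
    """Place higher-relevance items in higher-attention positions.

    All three arguments are illustrative assumptions:
    - items: documents or few-shot examples to order
    - relevance: one score per item (higher means more salient)
    - position_attention: one estimated attention weight per position
    """
    n = len(items)
    if not (len(relevance) == n and len(position_attention) == n):
        raise ValueError("items, relevance, and position_attention must have equal length")

    # Positions sorted from most-attended to least-attended.
    slots = sorted(range(n), key=lambda p: position_attention[p], reverse=True)
    # Item indices sorted from most relevant to least relevant.
    ranked = sorted(range(n), key=lambda i: relevance[i], reverse=True)

    ordered: List[str] = [""] * n
    for slot, item_index in zip(slots, ranked):
        ordered[slot] = items[item_index]
    return ordered


# Hypothetical basin-shaped profile: high at both edges, low in the middle.
docs = ["a", "b", "c", "d", "e"]
scores = [0.9, 0.1, 0.5, 0.3, 0.7]
profile = [0.30, 0.10, 0.05, 0.15, 0.40]
print(attn_rank(docs, scores, profile))  # ['e', 'd', 'b', 'c', 'a']
```

With this made-up profile, the two most relevant items ("a" and "e") land at the two ends of the sequence and the least relevant item ("b") lands in the middle, where attention is assumed to be lowest.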

Findings

01. AttnRank improves multi-hop QA and few-shot learning results.

02. Models exhibit a consistent attention bias towards sequence edges.

03. Reordering inputs based on attention preferences boosts performance.

Abstract

The performance of Large Language Models (LLMs) is significantly sensitive to the contextual position of information in the input. To investigate the mechanism behind this positional bias, our extensive experiments reveal a consistent phenomenon we term the attention basin: when presented with a sequence of structured items (e.g., retrieved documents or few-shot examples), models systematically assign higher attention to the items at the beginning and end of the sequence, while neglecting those in the middle. Crucially, our analysis further reveals that allocating higher attention to critical information is key to enhancing model performance. Based on these insights, we introduce Attention-Driven Reranking (AttnRank), a two-stage framework that (i) estimates a model's intrinsic positional attention preferences using a small calibration set, and (ii) reorders retrieved documents or…
