Token Homogenization under Positional Bias

Viacheslav Yusupov; Danil Maksimov; Ameliia Alaeva; Tatiana Zaitceva; Antipina Anna; Anna Vasileva; Chenlin Liu; Rayuth Chheng; Danil Sazanakov; Andrey Chetvergov; Alina Ermilova; Egor Shvetsov

arXiv:2508.17126 · cs.CL · August 26, 2025


TL;DR

This paper explores how token representations in large language models tend to become uniform across layers, especially under positional bias, affecting model interpretability and performance.

Contribution

It provides empirical evidence linking token homogenization to positional bias and analyzes how attention mechanisms influence this phenomenon.

Findings

01. Tokens lose distinctiveness during processing.

02. Positional bias amplifies homogenization.

03. Homogenization depends on attention mechanisms.

Abstract

This paper investigates token homogenization, the convergence of token representations toward uniformity across transformer layers, and its relationship to positional bias in large language models. We empirically examine whether homogenization occurs and how positional bias amplifies this effect. Through layer-wise similarity analysis and controlled experiments, we demonstrate that tokens systematically lose distinctiveness during processing, particularly when biased toward extremal positions. Our findings confirm both the existence of homogenization and its dependence on positional attention mechanisms.
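
To make "layer-wise similarity analysis" concrete, here is a minimal sketch of one way such a measurement could be set up. The abstract does not say which similarity metric the authors use, so the choice of mean pairwise cosine similarity is an assumption, and the function names and toy vectors are hypothetical illustrations rather than anything taken from the paper. A score that rises with depth would be consistent with tokens becoming more alike.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_pairwise_cosine(tokens):
    """Average cosine similarity over all distinct token pairs in one layer."""
    pairs = list(combinations(tokens, 2))
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)


def layerwise_similarity(hidden_states):
    """Return one score per layer.

    hidden_states is a list of layers; each layer is a list of token vectors.
    This metric is an assumed stand-in, not the paper's stated measure.
    """
    return [mean_pairwise_cosine(layer) for layer in hidden_states]


# Toy data (hypothetical, not from the paper): three tokens, two layers.
toy_hidden_states = [
    [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]],  # layer 0: tokens point apart
    [[1.0, 0.1], [1.0, 0.0], [1.0, -0.1]],  # layer 1: tokens nearly aligned
]
scores = layerwise_similarity(toy_hidden_states)
print(scores)
```

On the toy data the second layer scores higher than the first, which is the pattern a homogenization analysis would look for. With a real model, the per-layer token vectors would come from its hidden states instead of hand-written lists.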
