Rethinking Addressing in Language Models via Contexualized Equivariant Positional Encoding

Jiajun Zhu; Peihao Wang; Ruisi Cai; Jason D. Lee; Pan Li; Zhangyang Wang

arXiv:2501.00712 · cs.CL · August 22, 2025

Open Access · 1 Repo

TL;DR

This paper introduces TAPE, a new framework for positional encoding in transformers that incorporates sequence content to improve long-range reasoning and task adaptability, with provable benefits and practical efficiency.

Contribution

TAPE provides a dynamic, content-aware positional encoding method that enhances transformer performance and reasoning capabilities, surpassing traditional fixed-position techniques.

Findings

01. Improves long-context reasoning in language models.
02. Enhances arithmetic reasoning and retrieval tasks.
03. Achieves superior performance over existing positional encodings.

Abstract

Transformers rely on both content-based and position-based addressing mechanisms to make predictions, but existing positional encoding techniques often diminish the effectiveness of position-based addressing. Many current methods enforce rigid patterns in attention maps, limiting the ability to model long-range dependencies and adapt to diverse tasks. Additionally, most positional encodings are learned as general biases, lacking the specialization required for different instances within a dataset. To address this, we propose conTextualized equivariAnt Position Encoding (TAPE), a novel framework that enhances positional embeddings by incorporating sequence content across layers. TAPE introduces dynamic, context-aware positional encodings, overcoming the constraints of traditional fixed patterns. We show that TAPE can provably facilitate LLM…
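The abstract's distinction between content-based and position-based addressing, and its point about rigid attention patterns, can be illustrated with a toy example. The sketch below is not TAPE and is not taken from the paper's repository. It assumes a causal attention layer whose logits are the sum of a content term (scaled query-key dot product) and a fixed ALiBi-style linear distance penalty. The function names, the `slope` parameter, and the toy vectors are hypothetical choices made for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(queries, keys, slope=0.0):
    """Causal attention weights from content logits plus a fixed distance bias.

    The bias -slope * (i - j) depends only on positions, never on content,
    so it is the same for every input sequence (an assumption standing in
    for "fixed pattern" positional encodings in general).
    """
    dim = len(queries[0])
    weights = []
    for i, q in enumerate(queries):
        logits = []
        for j in range(i + 1):  # causal: position i attends to j <= i
            content = sum(a * b for a, b in zip(q, keys[j])) / math.sqrt(dim)
            position = -slope * (i - j)
            logits.append(content + position)
        weights.append(softmax(logits))
    return weights


# Toy sequence: the last query matches the key at position 0 by content.
keys = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0]]
queries = [[0.0, 1.0], [0.0, 1.0], [0.0, 1.0], [2.0, 0.0]]

content_only = attention_weights(queries, keys, slope=0.0)
with_bias = attention_weights(queries, keys, slope=2.0)

last = len(queries) - 1
print("content only:", [round(w, 3) for w in content_only[last]])
print("with fixed bias:", [round(w, 3) for w in with_bias[last]])
```

With `slope=0.0` the last token attends mostly to position 0, the distant token it matches by content. With a strong fixed penalty, most of the weight moves to the nearest position regardless of content, which is the kind of instance-independent pattern the abstract argues against.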

Code & Models

Repositories

vita-group/tape (PyTorch, official)

Taxonomy

Topics: Natural Language Processing Techniques · Speech and dialogue systems · Topic Modeling

Methods: Softmax · Attention Is All You Need