Towards Infinite Length Extrapolation: A Unified Approach

Nitin Vetcha

arXiv:2601.06113 · cs.AI · January 13, 2026

TL;DR

This paper introduces a unified framework for positional encoding in large language models, proposes Adaptive Positional Encoding (APE) for better long-range dependency handling, and demonstrates its effectiveness on datasets with extremely long sequences.

Contribution

It presents a unified reinterpretation of positional encoding methods, introduces APE with adaptive frequency modulation, and provides theoretical conditions for infinite-length extrapolation.

Findings

1. APE enables models to process sequences up to 32,000 words.

2. The framework unifies and generalizes existing positional encoding methods.

3. Theoretical analysis guarantees well-defined normalization over unbounded sequences.
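
To illustrate what well-defined normalization over an unbounded context can mean, consider a simplified setting that is assumed here and is not taken from the paper: content scores are bounded, $|s_{i,j}| \le C$, and the additive bias is a linear decay $b(d) = -m d$ with slope $m > 0$ (the symbols $C$, $m$, and $d$ are illustrative). The softmax normalizer for query position $i$ is then bounded by a geometric series:

```latex
0 < Z_i = \sum_{d=0}^{\infty} \exp\bigl(s_{i,i-d} - m d\bigr)
\le e^{C} \sum_{d=0}^{\infty} e^{-m d}
= \frac{e^{C}}{1 - e^{-m}} < \infty .
```

Under these assumptions the attention weights $\exp(s_{i,i-d} - m d) / Z_i$ remain a valid probability distribution however long the context grows. The paper's own conditions may be more general than this special case.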

Abstract

Large language models (LLMs) have revolutionized natural language processing, but their ability to process long sequences is fundamentally limited by the context window size during training. Existing length extrapolation methods often suffer from performance degradation or computational inefficiencies. We therefore use a unified framework that reinterprets positional encoding methods as a decomposition of the attention score into a multiplicative transformation and an additive bias. This perspective not only subsumes popular approaches such as relative position embeddings and attention-bias moderated approaches but also exposes their inherent limitations in handling long-range dependencies. To address these shortcomings, motivated by our framework, we introduce Adaptive Positional Encoding (APE), which leverages adaptive frequency modulation and an intricately designed decay bias that…
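
The decomposition described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's APE: it only shows how two well-known schemes fit the form score(i, j) = qᵀ M(i, j) k + b(i, j), with a rotary-style rotation as the multiplicative part and an ALiBi-style linear penalty as the additive part. The function names, the `base` and `slope` parameters, and their default values are assumptions made for this example.

```python
import numpy as np

def rope_rotation(offset, dim, base=10000.0):
    """Block-diagonal rotation matrix for a relative offset (rotary-style).

    Hypothetical helper: `base` follows the common rotary default.
    """
    half = dim // 2
    freqs = base ** (-np.arange(half) * 2.0 / dim)  # one frequency per 2-D block
    R = np.zeros((dim, dim))
    for p, angle in enumerate(offset * freqs):
        c, s = np.cos(angle), np.sin(angle)
        R[2 * p:2 * p + 2, 2 * p:2 * p + 2] = [[c, -s], [s, c]]
    return R

def linear_bias(i, j, slope=0.5):
    """ALiBi-style additive bias: a penalty linear in the distance |i - j|."""
    return -slope * abs(i - j)

def attention_score(q, k, i, j, transform=None, bias=None):
    """Pre-softmax score written as q^T M(i, j) k + b(i, j).

    transform: callable (offset, dim) -> matrix, or None for the identity.
    bias:      callable (i, j) -> float, or None for zero bias.
    """
    dim = q.shape[0]
    M = transform(j - i, dim) if transform is not None else np.eye(dim)
    b = bias(i, j) if bias is not None else 0.0
    return float(q @ M @ k) + b

rng = np.random.default_rng(0)
q, k = rng.normal(size=4), rng.normal(size=4)

# Multiplicative part only (rotary-style), additive part only (ALiBi-style),
# and both combined in the same expression.
s_rotary = attention_score(q, k, 3, 7, transform=rope_rotation)
s_alibi = attention_score(q, k, 3, 7, bias=linear_bias)
s_both = attention_score(q, k, 3, 7, transform=rope_rotation, bias=linear_bias)
```

With `transform=rope_rotation` the score equals the dot product of the separately rotated query and key, which is the usual rotary formulation. With `bias=linear_bias` alone it reduces to the plain dot product minus a distance penalty. Writing both as arguments to one function is what makes the two families directly comparable.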

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning