Position Encoding with Random Float Sampling Enhances Length Generalization of Transformers

Atsushi Shimizu; Shohei Taniguchi; Yutaka Matsuo

arXiv:2602.14050 · cs.LG · February 17, 2026

TL;DR

This paper proposes Random Float Sampling (RFS), a position encoding strategy for transformers that assigns tokens randomly sampled continuous position values instead of indices from a fixed discrete set. Exposing the model to diverse position values during training improves length generalization and zero-shot commonsense reasoning.

Contribution

Introduces Random Float Sampling, a simple position encoding strategy that enhances length generalization in transformers by avoiding out-of-distribution position indices on lengths unseen during training.

Findings

1. RFS improves length generalization performance.

2. RFS enhances zero-shot commonsense reasoning.

3. RFS is applicable to existing position encodings, for instance the absolute sinusoidal encoding, RoPE, and ALiBi.

Abstract

Length generalization is the ability of language models to maintain performance on inputs longer than those seen during pretraining. In this work, we introduce a simple yet powerful position encoding (PE) strategy, Random Float Sampling (RFS), that generalizes well to lengths unseen during pretraining or fine-tuning. In particular, instead of selecting position indices from a predefined discrete set, RFS uses randomly sampled continuous values, thereby avoiding out-of-distribution (OOD) issues on unseen lengths by exposing the model to diverse indices during training. Since assigning indices to tokens is a common and fundamental procedure in widely used PEs, the advantage of RFS can easily be incorporated into, for instance, the absolute sinusoidal encoding, RoPE, and ALiBi. Experiments corroborate its effectiveness by showing that RFS results in superior performance in length…
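The core idea in the abstract, replacing discrete position indices with randomly sampled continuous values, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `sample_float_positions` and `sinusoidal_encoding`, the uniform sampling range, the `max_pos` parameter, and the choice to sort the samples so that token order is preserved are all assumptions made for illustration. The sketch only shows that the standard absolute sinusoidal encoding can be evaluated at real-valued positions.

```python
import math
import random


def sample_float_positions(seq_len, max_pos, rng):
    """Draw one continuous position per token.

    ASSUMPTION: positions are sampled uniformly between 0 and max_pos and
    then sorted so that later tokens get larger positions. The paper's
    actual sampling scheme may differ.
    """
    return sorted(rng.uniform(0.0, max_pos) for _ in range(seq_len))


def sinusoidal_encoding(position, d_model, base=10000.0):
    """Absolute sinusoidal encoding evaluated at a real-valued position."""
    enc = []
    for i in range(0, d_model, 2):
        freq = 1.0 / (base ** (i / d_model))
        enc.append(math.sin(position * freq))
        enc.append(math.cos(position * freq))
    return enc[:d_model]


# Hypothetical usage: 8 tokens, positions drawn from a range much larger
# than the sequence length, so the model would see varied position values.
rng = random.Random(0)
positions = sample_float_positions(seq_len=8, max_pos=64.0, rng=rng)
encodings = [sinusoidal_encoding(p, d_model=16) for p in positions]
```

In a training loop under these assumptions, positions would be resampled for every sequence, so the same token offset maps to different continuous values across batches.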

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications