DAPE: Data-Adaptive Positional Encoding for Length Extrapolation
Chuanyang Zheng, Yihang Gao, Han Shi, Minbin Huang, Jingyao Li, Jing Xiong, Xiaozhe Ren, Michael Ng, Xin Jiang, Zhenguo Li, Yu Li

TL;DR
This paper introduces DAPE, a data-adaptive positional encoding method for transformers that dynamically adjusts to input data, significantly improving length extrapolation and generalization capabilities on real-world datasets.
Contribution
We propose a novel DAPE method that adapts positional encoding based on input context, enhancing length generalization in transformer models beyond static encoding approaches.
Findings
DAPE improves performance on length extrapolation tasks.
DAPE outperforms static positional encoding methods at longer sequence lengths.
The model maintains local and anti-local information effectively.
Abstract
Positional encoding plays a crucial role in transformers, significantly impacting model performance and length generalization. Prior research has introduced absolute positional encoding (APE) and relative positional encoding (RPE) to distinguish token positions in given sequences. However, both APE and RPE remain fixed after model training regardless of input data, limiting their adaptability and flexibility. Hence, we expect that the desired positional encoding should be data-adaptive and dynamically adjustable given the attention. In this paper, we propose a Data-Adaptive Positional Encoding (DAPE) method, which dynamically and semantically adjusts based on input context and learned fixed priors. Experimental validation on real-world datasets (Arxiv, Books3, and CHE) demonstrates that DAPE enhances model performance in terms of trained length and length generalization,…
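The abstract describes the idea only at a high level, so the following is a minimal sketch under stated assumptions rather than the authors' implementation. It assumes the adaptive term is a small two-layer MLP `f` that, at each query-key pair, mixes the attention logits and a static bias `B` across heads, and that the final logits are `logits + B + f(logits, B)`. The function names, the ALiBi-style prior, the hidden width, and the random weights (which would be learned in practice) are all hypothetical choices made for illustration.

```python
import math
import random


def alibi_style_bias(num_heads, seq_len):
    """Static prior B[h][i][j] = -slope_h * (i - j); an ALiBi-style choice (assumption)."""
    slopes = [2.0 ** (-8.0 * (h + 1) / num_heads) for h in range(num_heads)]
    return [[[-s * (i - j) for j in range(seq_len)] for i in range(seq_len)]
            for s in slopes]


def leaky_relu(x, negative_slope=0.01):
    return x if x > 0 else negative_slope * x


def adaptive_bias(logits, bias, w1, w2):
    """Hypothetical f(logits, B): a two-layer MLP applied across heads at each (i, j)."""
    num_heads, seq_len = len(logits), len(logits[0])
    out = [[[0.0] * seq_len for _ in range(seq_len)] for _ in range(num_heads)]
    for i in range(seq_len):
        for j in range(seq_len):
            # Features: every head's logit and every head's static bias at (i, j).
            feats = ([logits[h][i][j] for h in range(num_heads)]
                     + [bias[h][i][j] for h in range(num_heads)])
            hidden = [leaky_relu(sum(w * x for w, x in zip(row, feats))) for row in w1]
            for h in range(num_heads):
                out[h][i][j] = sum(w * x for w, x in zip(w2[h], hidden))
    return out


def dape_attention_weights(logits, bias, w1, w2):
    """Causal softmax over logits + B + f(logits, B) (assumed combination)."""
    num_heads, seq_len = len(logits), len(logits[0])
    extra = adaptive_bias(logits, bias, w1, w2)
    weights = []
    for h in range(num_heads):
        head = []
        for i in range(seq_len):
            # Causal mask: query i attends only to keys 0..i.
            scores = [logits[h][i][j] + bias[h][i][j] + extra[h][i][j]
                      for j in range(i + 1)]
            top = max(scores)
            exps = [math.exp(s - top) for s in scores]
            total = sum(exps)
            head.append([e / total for e in exps] + [0.0] * (seq_len - i - 1))
        weights.append(head)
    return weights


# Toy usage: random numbers stand in for QK^T logits and for learned MLP weights.
rng = random.Random(0)
H, T, D = 2, 4, 3  # heads, sequence length, hidden width (all illustrative)
toy_logits = [[[rng.uniform(-1, 1) for _ in range(T)] for _ in range(T)] for _ in range(H)]
toy_w1 = [[rng.uniform(-0.5, 0.5) for _ in range(2 * H)] for _ in range(D)]
toy_w2 = [[rng.uniform(-0.5, 0.5) for _ in range(D)] for _ in range(H)]
toy_weights = dape_attention_weights(toy_logits, alibi_style_bias(H, T), toy_w1, toy_w2)
print(toy_weights[0][T - 1])  # attention of the last query in head 0
```

With `w2` set to all zeros the adaptive term vanishes and the sketch reduces to attention with a purely static bias, which is the fixed-after-training behavior the abstract contrasts DAPE against. Because `f` takes the logits as input, the positional adjustment changes whenever the input sequence changes.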
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Handwritten Text Recognition Techniques · Video Analysis and Summarization
