SELF: Self-Extend the Context Length With Logistic Growth Function

Phat Thanh Dang; Saahil Thoppay; Wang Yang; Qifan Wang; Vipin Chaudhary; Xiaotian Han

arXiv:2505.17296 · cs.CL · May 26, 2025

TL;DR

SELF is a logistic-growth-based method for extending a language model's effective context length. It improves performance on long-context tasks by up to 12% over the LongLM extension method on LEval.

Contribution

The paper proposes a context extension technique that groups consecutive tokens using a logistic capacity function, and reports that it outperforms the LongLM extension method on LEval and LongBench.

Findings

01. Up to 12% performance increase over LongLM on LEval (on the Qwen model).

02. Up to 6.4% improvement over LongLM on LongBench summarization tasks (on the Llama-2-7b model).

03. Up to 5.4% better results on LEval reading comprehension.

Abstract

Large language models run into problems when operating on contexts longer than their training context length, because of the standard position encoding applied to tokens in the attention layer. Tokens that are far apart rarely affect each other, and long prompts yield unexpected results. To solve this problem, we propose SELF (Self-Extend the Context Length With Logistic Growth Function), which groups consecutive tokens into groups of varying size using a logistic capacity equation, combined with a constant group size at smaller relative distances. Our model improved performance by up to 12% over the LongLM extension method on LEval (specifically on the Qwen model). On summarization-related tasks in LongBench, our model performed up to 6.4% better than LongLM (specifically on the Llama-2-7b model). On reading comprehension tasks from LEval, our model…
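The grouping idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact equation, so everything here is an assumption made for illustration: the function names, the logistic parameters (`max_size`, `rate`, `midpoint`), the choice to grow the group size per group index, and the default constant group size of 1 at small distances. It is not the authors' implementation.

```python
import math


def logistic_group_size(k, max_size=16.0, rate=0.5, midpoint=6.0):
    """Hypothetical logistic curve giving the size of the k-th group.

    max_size, rate, and midpoint are illustrative values, not taken from the paper.
    """
    size = max_size / (1.0 + math.exp(-rate * (k - midpoint)))
    return max(1, round(size))  # every group holds at least one distance


def build_position_map(max_distance, near_window=8, near_group=1):
    """Map each relative distance 0..max_distance to a compressed position.

    Distances below near_window use a constant group size (near_group);
    larger distances are grouped with logistically growing group sizes.
    """
    near_count = min(near_window, max_distance + 1)
    mapped = [d // near_group for d in range(near_count)]

    pos = -(-near_window // near_group)  # first position after the near region
    d, k = near_window, 0
    while d <= max_distance:
        size = logistic_group_size(k)
        count = min(size, max_distance + 1 - d)  # clip the last group
        mapped.extend([pos] * count)
        d += size
        pos += 1
        k += 1
    return mapped


if __name__ == "__main__":
    positions = build_position_map(200)
    print("largest mapped position:", positions[-1])
```

Under these assumptions, nearby tokens keep their exact relative positions, while distant tokens share positions in progressively larger groups, so the largest position index seen by the attention layer grows much more slowly than the raw distance.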

Code & Models

Repositories

alexeipc/self-llm (PyTorch, official)

Taxonomy

Methods: Softmax · Attention Is All You Need