DCIS: Efficient Length Extrapolation of LLMs via Divide-and-Conquer Scaling Factor Search

Lei Yang; Shaoyang Xu; Jianxiang Peng; Shaolin Zhu; Deyi Xiong

arXiv:2412.18811 · cs.CL · November 25, 2025

TL;DR

This paper introduces DCIS, a divide-and-conquer search algorithm that finds better RoPE scaling factors for extending an LLM's context window, reducing fine-tuning cost and improving performance at longer contexts.

Contribution

The paper proposes a novel DCIS algorithm that strategically searches for scaling factors, enabling effective context length extension with less fine-tuning and higher efficiency than existing methods.

Findings

01 DCIS doubles search efficiency compared to other methods.

02 The identified scaling factors improve performance at extended lengths.

03 Models can generalize to longer contexts without additional fine-tuning.

Abstract

Large language models (LLMs) based on the Transformer architecture usually have their context length limited due to the high training cost. Recent advancements extend the context window by adjusting the scaling factors of RoPE and fine-tuning. However, suboptimal initialization of these factors results in increased fine-tuning costs and reduced performance at target length. To address these challenges, we propose a novel RoPE-based fine-tuning framework that diverges from conventional scaling factors search. Specifically, we present a Divide-and-Conquer Incremental Search (DCIS) algorithm that strategically determines the better scaling factors. Further fine-tuning with the identified scaling factors effectively extends the context window of LLMs. Empirical results demonstrate that our methodology not only mitigates performance decay at extended…
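
To make "scaling factors of RoPE" concrete, here is a minimal sketch of how per-dimension scaling factors might enter the rotary angles. It assumes the common convention in which each rotation frequency base^(-2i/d) is divided by its scaling factor; the abstract does not specify the exact parameterization DCIS uses or how its search proceeds, so the function names (`rope_frequencies`, `scaled_rope_angles`) and the example values are illustrative assumptions, not the authors' code.

```python
import math


def rope_frequencies(head_dim, base=10000.0):
    """Standard RoPE rotation frequencies: one per pair of head dimensions."""
    return [base ** (-2.0 * i / head_dim) for i in range(head_dim // 2)]


def scaled_rope_angles(position, head_dim, scaling_factors, base=10000.0):
    """Rotation angles at `position` after per-dimension frequency scaling.

    Assumption (not taken from the paper): frequency i is divided by
    scaling_factors[i], so a factor > 1 slows the rotation of that dimension.
    """
    freqs = rope_frequencies(head_dim, base)
    if len(scaling_factors) != len(freqs):
        raise ValueError("need one scaling factor per frequency")
    return [position * f / s for f, s in zip(freqs, scaling_factors)]


# Hypothetical example: a uniform factor of 4 maps position 8192 onto the
# angles an unscaled model produces at position 2048.
extended = scaled_rope_angles(8192, head_dim=8, scaling_factors=[4.0] * 4)
original = scaled_rope_angles(2048, head_dim=8, scaling_factors=[1.0] * 4)
print(all(math.isclose(a, b) for a, b in zip(extended, original)))
```

A uniform factor corresponds to plain position interpolation. Methods that search over scaling factors typically allow a different factor for each dimension, which is why the choice of starting factors matters before fine-tuning.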

Code & Models

Repositories

YL-9/dcis (PyTorch, official)

Videos

DCIS: Efficient Length Extrapolation of LLMs via Divide-and-Conquer Scaling Factor Search · underline

Taxonomy

Topics: Algorithms and Data Compression · Handwritten Text Recognition Techniques · Natural Language Processing Techniques

Methods: Byte Pair Encoding · Linear Layer · Absolute Position Encodings · Dropout · Softmax · Attention Is All You Need · Dense Connections · Residual Connection · Multi-Head Attention · Adam