Are Long-LLMs A Necessity For Long-Context Tasks?
Hongjin Qian, Zheng Liu, Peitian Zhang, Kelong Mao, Yujia Zhou, Xu Chen, Zhicheng Dou

TL;DR
This paper challenges the necessity of long-LLMs for long-context tasks. It proposes LC-Boost, a framework that lets short-LLMs handle such tasks through adaptive context access and utilization, improving performance on long-context benchmarks while consuming fewer resources than long-LLMs.
Contribution
The paper introduces LC-Boost, a novel framework allowing short-LLMs to solve long-context tasks efficiently without requiring long-LLMs.
Findings
LC-Boost improves performance on long-context benchmarks.
It reduces resource consumption compared to long-LLMs.
It adaptively accesses and utilizes context for diverse tasks.
Abstract
The learning and deployment of long-LLMs remains a challenging problem despite recent progress. In this work, we argue that long-LLMs are not a necessity for solving long-context tasks, as common long-context tasks are short-context solvable, i.e., they can be solved by working purely with oracle short-contexts within the long-context tasks' inputs. On top of this argument, we propose a framework called LC-Boost (Long-Context Bootstrapper), which enables a short-LLM to address long-context tasks in a bootstrapping manner. In our framework, the short-LLM prompts itself to reason about two critical decisions: 1) how to access the appropriate part of the context within the input, and 2) how to make effective use of the accessed context. By adaptively accessing and utilizing the context based on the presented tasks, LC-Boost can serve as a general framework to handle diversified long-context…
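
The two decisions named in the abstract (which part of the input to access, and how to use what was accessed) can be pictured as a loop over short chunks. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the names split_into_chunks, answer_with_short_contexts, keyword_relevant, and append_update are hypothetical, and the two toy functions stand in for decisions that, in the paper's framework, the short-LLM makes by prompting itself.

```python
from typing import Callable, List


def split_into_chunks(text: str, chunk_size: int) -> List[str]:
    """Split text into consecutive chunks of at most chunk_size characters."""
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


def answer_with_short_contexts(
    long_input: str,
    question: str,
    is_relevant: Callable[[str, str], bool],
    update_answer: Callable[[str, str, str], str],
    chunk_size: int = 200,
) -> str:
    """Process a long input one short chunk at a time (hypothetical sketch)."""
    notes = ""
    for chunk in split_into_chunks(long_input, chunk_size):
        # Decision 1 (access): is this part of the context worth reading?
        if not is_relevant(question, chunk):
            continue
        # Decision 2 (utilization): fold the accessed chunk into the running result.
        notes = update_answer(question, notes, chunk)
    return notes


# Toy stand-ins for the model's decisions (assumptions for illustration only).
def keyword_relevant(question: str, chunk: str) -> bool:
    """Treat a chunk as relevant if it contains any longer word from the question."""
    keywords = {w.strip("?.,").lower() for w in question.split() if len(w) > 4}
    return any(k in chunk.lower() for k in keywords)


def append_update(question: str, notes: str, chunk: str) -> str:
    """Keep the accessed chunk verbatim; a real system would summarize or answer."""
    return (notes + " " + chunk).strip()


if __name__ == "__main__":
    document = "filler " * 100 + "The capital of Atlantis is Poseidonia. " + "filler " * 100
    print(answer_with_short_contexts(
        document, "What is the capital of Atlantis?", keyword_relevant, append_update
    ))
```

In this sketch each call sees at most one chunk plus the running notes, so the working context stays short no matter how long the input is. A fixed character-based split is the simplest choice and can cut a relevant sentence across two chunks; it is used here only to keep the example small.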
Taxonomy
Topics: Distributed and Parallel Computing Systems · Software System Performance and Reliability · Business Process Modeling and Analysis
