LIFT: Improving Long Context Understanding Through Long Input Fine-Tuning
Yansheng Mao, Jiaqi Li, Fanxu Meng, Jing Xiong, Zilong Zheng, Muhan Zhang

TL;DR
This paper presents LIFT (Long Input Fine-Tuning), a framework that improves large language models' understanding of long contexts by adapting model parameters to the input at test time, avoiding the computational burden of offline long-context adaptation.
Contribution
The paper introduces a method for improving long-context understanding in LLMs by adapting model parameters to the context at test time, and further strengthens it with in-context learning and pre-LIFT supervised fine-tuning.
Findings
LIFT improves performance on long-context benchmarks like LooGLE and LongBench.
Combined with in-context learning, LIFT enables short-context models such as Llama 3 to handle arbitrarily long inputs.
LIFT processes lengthy inputs without the computational burden of offline long-context adaptation.
Abstract
Long context understanding remains challenging for large language models due to their limited context windows. This paper introduces Long Input Fine-Tuning (LIFT) for long context modeling, a novel framework that enhances LLM performance on long-context tasks by adapting model parameters to the context at test time. LIFT enables efficient processing of lengthy inputs without the computational burden of offline long-context adaptation, and can improve the long-context capabilities of arbitrary short-context models. The framework is further enhanced by integrating in-context learning and pre-LIFT supervised fine-tuning. The combination of in-context learning and LIFT enables short-context models like Llama 3 to handle arbitrarily long contexts and consistently improves their performance on popular long-context benchmarks like LooGLE and LongBench. We also provide a comprehensive analysis…
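As a rough illustration of how an input longer than the context window could be prepared for test-time adaptation, the sketch below splits a long token sequence into overlapping windows that each fit a short context. The summary above does not specify the segmentation scheme, so the function name `split_into_segments` and the parameters `window` and `overlap` are hypothetical, and treating each window as one fine-tuning example is an assumption rather than the authors' confirmed procedure.

```python
from typing import List, Sequence


def split_into_segments(tokens: Sequence[int], window: int, overlap: int) -> List[List[int]]:
    """Split a token sequence into overlapping windows of at most `window` tokens.

    Hypothetical helper: `window` stands in for the short model's context
    length, and `overlap` is the number of tokens shared by adjacent windows.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    if not 0 <= overlap < window:
        raise ValueError("overlap must satisfy 0 <= overlap < window")
    if len(tokens) == 0:
        return []

    stride = window - overlap  # how far each new window advances
    segments: List[List[int]] = []
    start = 0
    while True:
        segments.append(list(tokens[start:start + window]))
        # Stop once the current window reaches the end of the input.
        if start + window >= len(tokens):
            break
        start += stride
    return segments


if __name__ == "__main__":
    # Toy input: integers stand in for token ids.
    long_input = list(range(10))
    for segment in split_into_segments(long_input, window=4, overlap=2):
        print(segment)
```

Under this assumption, each window would serve as one training example for a language-modeling fine-tuning step on the input itself, with the overlap intended to preserve some continuity across window boundaries.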
Taxonomy
Topics: Robotics and Automated Systems
Methods: LLaMA
