LIFT: Improving Long Context Understanding Through Long Input Fine-Tuning

Yansheng Mao; Jiaqi Li; Fanxu Meng; Jing Xiong; Zilong Zheng; Muhan Zhang

arXiv:2412.13626 · cs.CL · December 19, 2024

TL;DR

This paper presents LIFT, a fine-tuning framework that adapts a large language model's parameters to a long input at test time, improving long-context understanding without the computational burden of offline long-context adaptation.

Contribution

LIFT improves long-context understanding in LLMs by adapting model parameters to the context at test time, and can be further enhanced with in-context learning and pre-LIFT supervised fine-tuning.

Findings

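1. Combined with in-context learning, LIFT consistently improves the performance of short-context models on long-context benchmarks such as LooGLE and LongBench.

2. The combination of in-context learning and LIFT enables short-context models like Llama 3 to handle arbitrarily long contexts.

3. The framework processes lengthy inputs without the computational burden of offline long-context adaptation.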

Abstract

Long context understanding remains challenging for large language models due to their limited context windows. This paper introduces Long Input Fine-Tuning (LIFT) for long context modeling, a novel framework that enhances LLM performance on long-context tasks by adapting model parameters to the context at test time. LIFT enables efficient processing of lengthy inputs without the computational burden of offline long-context adaptation, and can improve the long-context capabilities of arbitrary short-context models. The framework is further enhanced by integrating in-context learning and pre-LIFT supervised fine-tuning. The combination of in-context learning and LIFT enables short-context models like Llama 3 to handle arbitrarily long contexts and consistently improves their performance on popular long-context benchmarks like LooGLE and LongBench. We also provide a comprehensive analysis…
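The abstract says LIFT adapts model parameters to the context at test time but, as quoted here, does not say how an input longer than the context window is prepared for fine-tuning. As a rough illustration only, one plausible preprocessing step is to cut the tokenized input into overlapping segments that each fit a short context window, so that every segment can serve as a separate training example. The function name `overlapping_segments` and the parameters `window` and `stride` are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def overlapping_segments(
    tokens: Sequence[int], window: int, stride: int
) -> List[List[int]]:
    """Split a long token sequence into segments of at most `window` tokens.

    Hypothetical helper, not the paper's procedure. Consecutive segments
    start `stride` tokens apart, so they overlap whenever stride < window.
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    if stride > window:
        raise ValueError("stride larger than window would skip tokens")
    if len(tokens) == 0:
        return []

    segments: List[List[int]] = []
    start = 0
    while True:
        segments.append(list(tokens[start:start + window]))
        # Stop once the current segment reaches the end of the input.
        if start + window >= len(tokens):
            break
        start += stride
    return segments


# Toy input: 10 "token ids", a window of 4 tokens, and a stride of 3.
segments = overlapping_segments(list(range(10)), window=4, stride=3)
print(segments)  # [[0, 1, 2, 3], [3, 4, 5, 6], [6, 7, 8, 9]]
```

In a real setup each segment would be passed to a fine-tuning step on the model being adapted; that step is omitted here because the abstract gives no detail about the training objective or hyperparameters.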

Taxonomy

Topics: Robotics and Automated Systems

Methods: LLaMA