KLong: Training LLM Agent for Extremely Long-horizon Tasks

Yue Liu; Yingwei Ma; Yibo Miao; Yanhao Li; Yuchong Xie; Xinlong Yang; Zhiyuan Hu; Flood Sung; Jiaheng Zhang; Bryan Hooi

arXiv:2602.17547·cs.AI·April 28, 2026

TL;DR

KLong is an open-source LLM agent for extremely long-horizon tasks, trained with trajectory-splitting SFT followed by progressive RL to improve long-horizon problem solving.

Contribution

The paper introduces a novel training pipeline combining trajectory-splitting SFT and progressive RL, enabling LLMs to better handle very long tasks.

Findings

01. KLong (106B) outperforms Kimi K2 Thinking (1T) by 11.28% on PaperBench.

02. KLong demonstrates superior performance and generalization on multiple long-horizon benchmarks.

03. The proposed methods effectively preserve context and extend task-solving capabilities.

Abstract

This paper introduces KLong, an open-source LLM agent trained to solve extremely long-horizon tasks. The principle is to first cold-start the model via trajectory-splitting SFT, then scale it via progressive RL training. Specifically, we first activate basic agentic abilities of a base model with a comprehensive SFT recipe. Then, we introduce Research-Factory, an automated pipeline that generates high-quality training data by collecting research papers and constructing evaluation rubrics. Using this pipeline, we build thousands of long-horizon trajectories distilled from Claude 4.5 Sonnet (Thinking). To train with these extremely long trajectories, we propose a new trajectory-splitting SFT, which preserves early context, progressively truncates later context, and maintains overlap between sub-trajectories. In addition, to further improve long-horizon task-solving capability, we propose…
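The abstract names three properties of trajectory-splitting SFT: early context is preserved, later context is truncated, and neighboring sub-trajectories overlap. The sketch below is one possible reading of that description, not the authors' implementation. The function name `split_trajectory` and the parameters `prefix_len`, `window_len`, and `overlap` are all hypothetical, and treating "progressive truncation" as a sliding window over the later steps is an assumption.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def split_trajectory(
    steps: Sequence[T],
    prefix_len: int,
    window_len: int,
    overlap: int,
) -> List[List[T]]:
    """Split one long trajectory into shorter training samples.

    Assumed scheme (illustrative only, not taken from the paper):
    - the first `prefix_len` steps (e.g. the task statement) are kept
      in every sample, which preserves early context;
    - the remaining steps are covered by a sliding window of
      `window_len` steps, so older later-steps are dropped as the
      window advances;
    - consecutive windows share `overlap` steps.
    """
    if prefix_len < 0 or window_len <= 0:
        raise ValueError("prefix_len must be >= 0 and window_len > 0")
    if not 0 <= overlap < window_len:
        raise ValueError("overlap must satisfy 0 <= overlap < window_len")

    prefix = list(steps[:prefix_len])
    body = list(steps[prefix_len:])

    # Short trajectories fit in a single sample and need no splitting.
    if len(body) <= window_len:
        return [prefix + body]

    stride = window_len - overlap
    samples: List[List[T]] = []
    start = 0
    while True:
        samples.append(prefix + body[start:start + window_len])
        if start + window_len >= len(body):
            break  # this window reached the end of the trajectory
        start += stride
    return samples


if __name__ == "__main__":
    # Ten toy "steps"; the first two stand in for the early context.
    for sample in split_trajectory(list(range(10)), 2, 4, 1):
        print(sample)
```

In this toy run every sample begins with steps 0 and 1, and each window repeats the last step of the previous one, so no sample starts without the context it continues from. In practice the units would more likely be tokens or agent turns than list items, and the window sizes would be set by the model's context limit.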

Code & Models

Repositories

nsr9/klong (GitHub)
