On Training Large Language Models for Long-Horizon Tasks: An Empirical Study of Horizon Length

Sunghwan Kim; Junhee Cho; Beong-woo Kwak; Taeyoon Kwon; Liang Wang; Nan Yang; Xingxing Zhang; Furu Wei; Jinyoung Yeo

arXiv:2605.02572 · cs.AI · May 5, 2026

TL;DR

This empirical study examines how task horizon length affects the training of large language model agents. It finds that longer horizons alone destabilize training, and that reducing the horizon stabilizes training and improves generalization to longer horizons at inference.

Contribution

The paper systematically examines the impact of horizon length on LLM training, highlighting horizon reduction as a key method to improve stability and generalization in long-horizon tasks.

Findings

01. Increasing horizon length causes training instability due to exploration and credit assignment issues.

02. Horizon reduction stabilizes training and improves performance on long-horizon tasks.

03. Models trained with reduced horizons generalize better to longer horizons at inference.
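
For a rough intuition about the exploration difficulty named in the first finding, consider a toy model that is an assumption of this note and not the paper's experimental setup: the agent picks the correct action at each step independently with a fixed probability, and reward arrives only when every step is correct. Under that assumption the chance of a successful rollout shrinks geometrically with horizon length. The function names below are hypothetical.

```python
def success_probability(per_step_accuracy: float, horizon: int) -> float:
    """Probability that all `horizon` steps are correct.

    Assumes steps are independent and equally likely to be correct,
    which is a simplification made only for illustration.
    """
    return per_step_accuracy ** horizon


def expected_rollouts_until_success(per_step_accuracy: float, horizon: int) -> float:
    """Mean number of rollouts before the first fully successful one.

    Follows from the geometric distribution: 1 / P(success).
    """
    return 1.0 / success_probability(per_step_accuracy, horizon)


if __name__ == "__main__":
    # Same per-step skill, different horizon lengths.
    for horizon in (5, 20, 80):
        p = success_probability(0.9, horizon)
        n = expected_rollouts_until_success(0.9, horizon)
        print(f"horizon={horizon:3d}  P(success)={p:.6f}  expected rollouts={n:.1f}")
```

In this toy setting, a shorter horizon makes successful rollouts far more frequent at the same per-step skill, which is one plausible reading of why horizon reduction eases exploration. It says nothing about the credit assignment effect or the generalization result.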

Abstract

Large language models (LLMs) have shown promise as interactive agents that solve tasks through extended sequences of environment interactions. While prior work has primarily focused on system-level optimizations or algorithmic improvements, the role of task horizon length in shaping training dynamics remains poorly understood. In this work, we present a systematic empirical study that examines horizon length through controlled task constructions. Specifically, we construct controlled tasks in which agents face identical decision rules and reasoning structures, but differ only in the length of action sequences required for successful completion. Our results reveal that increasing horizon length alone constitutes a training bottleneck, inducing severe training instability driven by exploration difficulties and credit assignment challenges. We demonstrate that horizon reduction is a key…
