On Training Large Language Models for Long-Horizon Tasks: An Empirical Study of Horizon Length
Sunghwan Kim, Junhee Cho, Beong-woo Kwak, Taeyoon Kwon, Liang Wang, Nan Yang, Xingxing Zhang, Furu Wei, Jinyoung Yeo

TL;DR
This empirical study investigates how task horizon length affects the training stability of large language models. It finds that longer horizons alone destabilize training, and that horizon reduction stabilizes training and improves generalization to longer horizons at inference.
Contribution
The paper systematically examines the impact of horizon length on LLM training, highlighting horizon reduction as a key method to improve stability and generalization in long-horizon tasks.
Findings
Increasing horizon length causes training instability due to exploration and credit assignment issues.
Horizon reduction stabilizes training and improves performance on long-horizon tasks.
Models trained with reduced horizons generalize better to longer horizons at inference.
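A rough way to see why longer horizons make exploration harder is the following minimal sketch. It is not the paper's experimental setup: it assumes, purely for illustration, that a task succeeds only if every one of its steps is correct and that each step is correct independently with the same probability. The function name and the 0.95 per-step accuracy are hypothetical.

```python
def rollout_success_probability(per_step_accuracy: float, horizon: int) -> float:
    """Chance that a rollout gets all `horizon` steps right.

    Illustrative assumption: steps are independent and equally likely
    to be correct, so the probabilities multiply.
    """
    return per_step_accuracy ** horizon


# Compare a short, a medium, and a long horizon at a hypothetical
# per-step accuracy of 0.95.
for horizon in (5, 20, 80):
    probability = rollout_success_probability(0.95, horizon)
    print(f"horizon={horizon:3d}  success probability={probability:.4f}")
```

Under these assumptions the chance of sampling a fully successful rollout falls from roughly 77% at 5 steps to under 2% at 80 steps, so a success-only reward signal becomes rare as the horizon grows. Real agents violate the independence assumption, so this is only a back-of-the-envelope illustration of the exploration difficulty named above.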
Abstract
Large language models (LLMs) have shown promise as interactive agents that solve tasks through extended sequences of environment interactions. While prior work has primarily focused on system-level optimizations or algorithmic improvements, the role of task horizon length in shaping training dynamics remains poorly understood. In this work, we present a systematic empirical study that examines horizon length through controlled task constructions. Specifically, we construct controlled tasks in which agents face identical decision rules and reasoning structures, but differ only in the length of action sequences required for successful completion. Our results reveal that increasing horizon length alone constitutes a training bottleneck, inducing severe training instability driven by exploration difficulties and credit assignment challenges. We demonstrate that horizon reduction is a key…
