FoldAct: Efficient and Stable Context Folding for Long-Horizon Search Agents

Jiaqi Shao; Yufeng Miao; Wei Zhang; Bing Luo

arXiv:2512.22733·cs.LG·December 30, 2025



TL;DR

FoldAct is a framework for stable and efficient long-horizon reinforcement learning with large language models. It addresses the problems that context folding introduces, improving both training stability and training speed.

Contribution

The paper proposes a method built from three components (separated loss, context consistency, and segment training) to address the non-stationarity and computational cost of context folding in RL.

Findings

01 Achieves a 5.19× training speedup.

02 Addresses the non-stationary observation distribution.

03 Enables stable training of long-horizon search agents.

Abstract

Long-horizon reinforcement learning (RL) for large language models faces critical scalability challenges from unbounded context growth, leading to context folding methods that compress interaction history during task execution. However, existing approaches treat summary actions as standard actions, overlooking that summaries fundamentally modify the agent's future observation space, creating a policy-dependent, non-stationary observation distribution that violates core RL assumptions. This introduces three fundamental challenges: (1) gradient dilution where summary tokens receive insufficient training signal, (2) self-conditioning where policy updates change summary distributions, creating a vicious cycle of training collapse, and (3) computational cost from processing unique contexts at each turn. We introduce FoldAct (https://github.com/SHAO-Jiaqi757/FoldAct), a…
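
The gradient dilution problem named in the abstract can be illustrated with a small, hedged sketch. It is not taken from the FoldAct code base. It assumes that "separated loss" means averaging the per-token loss over summary tokens and over action tokens independently before combining them, and the function names and the `summary_weight` parameter are hypothetical.

```python
from typing import Sequence


def pooled_loss(token_losses: Sequence[float]) -> float:
    """Mean loss over every token, with no distinction between token types."""
    return sum(token_losses) / len(token_losses)


def summary_share_pooled(is_summary: Sequence[bool]) -> float:
    """Fraction of the pooled mean's total weight that falls on summary tokens."""
    return sum(is_summary) / len(is_summary)


def separated_loss(
    token_losses: Sequence[float],
    is_summary: Sequence[bool],
    summary_weight: float = 0.5,  # hypothetical knob, not a value from the paper
) -> float:
    """Average summary and action tokens separately, then mix the two means.

    Assumption: this is one plausible reading of "separated loss", not the
    authors' confirmed formulation.
    """
    summary = [loss for loss, flag in zip(token_losses, is_summary) if flag]
    action = [loss for loss, flag in zip(token_losses, is_summary) if not flag]
    # An empty group contributes zero rather than raising a division error.
    summary_mean = sum(summary) / len(summary) if summary else 0.0
    action_mean = sum(action) / len(action) if action else 0.0
    return summary_weight * summary_mean + (1.0 - summary_weight) * action_mean


if __name__ == "__main__":
    # Toy trajectory: 10 summary tokens among 1,000 tokens in total.
    flags = [True] * 10 + [False] * 990
    losses = [2.0] * 10 + [1.0] * 990
    print("summary share (pooled):", summary_share_pooled(flags))
    print("pooled loss:", pooled_loss(losses))
    print("separated loss:", separated_loss(losses, flags))
```

In the toy trajectory, summary tokens make up 1% of the sequence, so a pooled mean gives them 1% of the total loss weight. Averaging the two groups separately fixes the summary group's weight at `summary_weight` regardless of how few summary tokens there are.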


Code & Models

Models

Luuvy/foldact-7b-local-qwen-2.5-it (model, 2 downloads)


Taxonomy

Topics: Topic Modeling · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications