ACC: Compiling Agent Trajectories for Long-Context Training
Qisheng Su, Zhen Fang, Shiting Huang, Yu Zeng, Yiming Zhao, Kou Shi, Ziao Zhang, Lin Chen, Zehui Chen, Lijun Wu, Feng Zhao

TL;DR
This paper introduces Agent Context Compilation (ACC), a method to convert agent trajectories into long-context QA pairs, enabling models to learn long-range reasoning without additional annotations.
Contribution
ACC transforms agent trajectories into training data for long-context reasoning, improving model performance on long-range dependency tasks without extra supervision.
Findings
Training with ACC improves performance on MRCR and GraphWalks benchmarks.
ACC enables models to answer questions directly from scattered evidence across turns.
The approach maintains general capabilities on other diverse tasks.
Abstract
Recent development of agents has renewed demand for long-context reasoning capacity of LLMs. However, training LLMs for this capacity requires costly long-document curation or heuristic context synthesis. We observe that agents produce massive trajectories when solving problems, invoking tools and receiving environment observations across many turns. The evidence needed to answer the original question is thus scattered throughout these turns, requiring integration of distant context segments. Nevertheless, standard agent SFT masks tool responses and only trains turn-level tool selection, creating a supervision blind spot where these scattered signals go unused. We propose Agent Context Compilation (ACC), which converts trajectories from search, software engineering, and database querying agents into long-context QA pairs that combine the original question with tool responses and…
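As a rough illustration of the idea in the abstract, the sketch below contrasts the loss mask used in standard agent SFT with a compilation step that turns one trajectory into a long-context QA pair. It is not the authors' implementation. The `Turn` type, the function names `sft_loss_mask` and `compile_to_qa`, the `[Observation i]` formatting, and the choice to use the final assistant turn as the answer are all assumptions made for this example. The abstract is truncated after "tool responses and…", so the sketch includes only the original question and the tool responses in the compiled sample.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Turn:
    role: str     # "user", "assistant", or "tool" (hypothetical schema)
    content: str


def sft_loss_mask(trajectory: List[Turn]) -> List[bool]:
    """Per-turn loss mask under standard agent SFT.

    Only assistant turns are trained on; tool responses are masked out,
    which is the supervision blind spot the abstract describes.
    """
    return [turn.role == "assistant" for turn in trajectory]


def compile_to_qa(trajectory: List[Turn]) -> Dict[str, str]:
    """Compile one trajectory into a long-context QA pair (illustrative only).

    Assumption: the first user turn is the original question and the last
    assistant turn is the final answer, so no extra annotation is needed.
    """
    question = next(t.content for t in trajectory if t.role == "user")
    answer = next(t.content for t in reversed(trajectory) if t.role == "assistant")
    observations = [t.content for t in trajectory if t.role == "tool"]

    # Concatenate tool responses in turn order to form the long context.
    context = "\n\n".join(
        f"[Observation {i}]\n{obs}" for i, obs in enumerate(observations, start=1)
    )
    return {"context": context, "question": question, "answer": answer}
```

Under these assumptions, the evidence that standard SFT masks out becomes the context the model must read to produce the answer, so the training signal covers integration across distant turns rather than only turn-level tool selection.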
