ASTER: Agentic Scaling with Tool-integrated Extended Reasoning
Xuqin Zhang, Quan He, Zhenrui Zheng, Zongzhang Zhang, Xu He, Dong Li

TL;DR
This paper introduces ASTER, a framework that improves tool-integrated reasoning in large language models by using a cold-start strategy with interaction-dense trajectories, leading to state-of-the-art performance on mathematical benchmarks.
Contribution
ASTER presents a novel cold-start approach that mitigates interaction collapse, enabling better exploration and generalization in RL-based tool reasoning for LLMs.
Findings
A small set of 4K interaction-dense trajectories improves downstream performance.
ASTER-4B surpasses existing models on mathematical benchmarks.
An interaction-dense cold start improves exploration and downstream RL outcomes.
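The notion of an "interaction-dense" trajectory can be made concrete with a small sketch. This summary does not give the paper's exact definition, so the code below assumes a simple proxy: the number of tool-response turns in a trajectory. The `Turn` type, the role names, and the `min_tool_turns` threshold are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Turn:
    role: str     # assumed role labels: "assistant" or "tool"
    content: str


def interaction_density(trajectory: List[Turn]) -> int:
    """Count tool-response turns (an assumed proxy for interaction density)."""
    return sum(1 for turn in trajectory if turn.role == "tool")


def select_dense(trajectories: List[List[Turn]], min_tool_turns: int = 3) -> List[List[Turn]]:
    """Keep only trajectories with at least `min_tool_turns` tool responses."""
    return [t for t in trajectories if interaction_density(t) >= min_tool_turns]
```

Under this proxy, a trajectory that reasons at length and runs a single check at the end scores 1 and would be filtered out at the default threshold, which loosely mirrors the "trivial, post-hoc code verification" pattern the abstract describes.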
Abstract
Reinforcement learning (RL) has emerged as a dominant paradigm for eliciting long-horizon reasoning in Large Language Models (LLMs). However, scaling Tool-Integrated Reasoning (TIR) via RL remains challenging due to interaction collapse: a pathological state where models fail to sustain multi-turn tool usage, instead degenerating into heavy internal reasoning with only trivial, post-hoc code verification. We systematically study three questions: (i) how cold-start SFT induces an agentic, tool-using behavioral prior, (ii) how the interaction density of cold-start trajectories shapes exploration and downstream RL outcomes, and (iii) how the RL interaction budget affects learning dynamics and generalization under varying inference-time budgets. We then introduce ASTER (Agentic Scaling with Tool-integrated Extended Reasoning), a framework that circumvents this collapse through a targeted…
Taxonomy
Topics: Topic Modeling · Multimodal Machine Learning Applications · Explainable Artificial Intelligence (XAI)
