Scaling Agentic Capabilities, Not Context: Efficient Reinforcement Finetuning for Large Toolspaces

Karan Gupta; Pranav Vajreshwari; Yash Pandya; Raghav Magazine; Akshay Nambi; Ahmed Awadallah

arXiv:2603.06713 · cs.LG · March 10, 2026

TL;DR

This paper introduces ATLAS, a reinforcement finetuning framework that trains small language models (SLMs) to work in large tool ecosystems by learning how to acquire context and how to execute actions. A 4B-parameter SLM trained this way approaches frontier-agent performance.

Contribution

The paper combines learnable context control (iterative tool loading and programmatic tool orchestration) with rubric-based reinforcement finetuning, so that small models can handle large toolspaces without saturating their context.

Findings

01. Large gains over generic RL baselines on MCP benchmarks
02. A 4B-parameter SLM approaches frontier-agent performance
03. Effective context management reduces context saturation

Abstract

Agentic systems operating over large tool ecosystems must plan and execute long-horizon workflows under weak or non-verifiable supervision. While frontier models mitigate these challenges through scale and large context budgets, small language models (SLMs) remain brittle: eager tool loading saturates context, execution errors compound over time, and sparse rewards limit learning. We introduce ATLAS, a reinforcement finetuning framework that enables SLMs to operate effectively in large-scale toolspace environments by learning how to acquire context and how to execute actions. Our approach makes two key contributions. First, we treat context control and execution structure as learnable decisions, combining iterative tool loading with programmatic tool orchestration to bound context growth and stabilize long-horizon trajectories. Second, we propose rubric-based reinforcement finetuning,…
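The contrast the abstract draws between eager tool loading and iterative tool loading can be made concrete with a toy calculation. The sketch below is not the ATLAS implementation: the `Tool` record, the token costs, the keyword-overlap retrieval in `search_tools`, and the example toolspace are all hypothetical, chosen only to show why loading tools on demand bounds context growth.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    name: str
    description: str
    schema_tokens: int  # hypothetical cost of placing the full schema in context


# Hypothetical toolspace; names and token counts are illustrative only.
TOOLS = [
    Tool("calendar.create_event", "create a calendar event", 400),
    Tool("calendar.list_events", "list calendar events", 350),
    Tool("mail.send", "send an email message", 300),
    Tool("files.search", "search files by name", 450),
    Tool("weather.get", "get the weather forecast", 250),
]


def eager_context_cost(tools):
    """Eager loading: every tool schema is placed in context up front."""
    return sum(t.schema_tokens for t in tools)


def search_tools(tools, query, k=2):
    """Toy retrieval (an assumption): rank tools by word overlap with the query."""
    words = set(query.lower().split())
    scored = [(len(words & set(t.description.lower().split())), t) for t in tools]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [t for score, t in ranked if score > 0][:k]


def iterative_context_cost(tools, queries, k=2):
    """Iterative loading: only tools retrieved at some step enter the context."""
    loaded = {}
    for query in queries:
        for tool in search_tools(tools, query, k):
            loaded[tool.name] = tool  # a tool already loaded is not paid for twice
    return sum(t.schema_tokens for t in loaded.values()), sorted(loaded)


if __name__ == "__main__":
    steps = ["create calendar event", "send email"]
    print("eager:", eager_context_cost(TOOLS))
    print("iterative:", iterative_context_cost(TOOLS, steps))
```

In this toy setting the eager cost grows with the size of the toolspace, while the iterative cost grows only with the tools a task actually touches. In the paper's framing, the decision of what to load and when is learned by the model rather than fixed by a retrieval heuristic like the one assumed here.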

Taxonomy

Topics: Reinforcement Learning in Robotics · Multi-Agent Systems and Negotiation · Machine Learning and Data Classification