Sparsity Is Necessary: Polynomial-Time Stability for Agentic LLMs in Large Action Spaces

Angshul Majumdar

arXiv:2601.08271·cs.AI·January 14, 2026

Open Access

TL;DR

This paper introduces Sparse Agentic Control (SAC), a framework for learning stable policies for agentic LLMs over large action spaces, and argues that sparsity is necessary for both sample efficiency and stability.

Contribution

The paper formalizes SAC, providing theoretical guarantees for sparse policy learning, support recovery, and stability in large action spaces, with extensions to various settings.

Findings

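01. Estimation error and value suboptimality scale as k (log M / T)^{1/2} under a Policy-RSC condition.

02. Exact tool-support recovery holds when T > k log M, under incoherence and beta-min conditions.

03. Any dense policy class requires Omega(M) samples, which the paper links to instability.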
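
To give a feel for the gap between these rates, the sketch below evaluates the three quantities for a few action-space sizes. It is only an illustration: the page does not give the constants hidden in the bounds, so every constant is set to 1 here, and the function names are hypothetical rather than taken from the paper.

```python
import math


def sparse_error_rate(k: int, M: int, T: int) -> float:
    """k * sqrt(log M / T), with the unspecified constant assumed to be 1."""
    return k * math.sqrt(math.log(M) / T)


def sparse_recovery_threshold(k: int, M: int) -> float:
    """k * log M, the sample scale for support recovery (constant assumed 1)."""
    return k * math.log(M)


def dense_sample_floor(M: int) -> int:
    """Omega(M) lower bound for dense policy classes (constant assumed 1)."""
    return M


if __name__ == "__main__":
    k = 5  # assumed number of relevant tools, chosen for illustration
    for M in (10**3, 10**4, 10**5, 10**6):
        sparse_T = sparse_recovery_threshold(k, M)
        dense_T = dense_sample_floor(M)
        print(f"M={M:>8}  k log M ~ {sparse_T:8.1f}  dense floor ~ {dense_T}")
```

Under these assumed constants, the sparse threshold grows only logarithmically in M while the dense floor grows linearly, which is the contrast the findings describe.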

Abstract

Tool-augmented LLM systems expose a control regime that learning theory has largely ignored: sequential decision-making with a massive discrete action universe (tools, APIs, documents) in which only a small, unknown subset is relevant for any fixed task distribution. We formalize this setting as Sparse Agentic Control (SAC), where policies admit block-sparse representations over M >> 1 actions and rewards depend on sparse main effects and (optionally) sparse synergies. We study ell_{1,2}-regularized policy learning through a convex surrogate and establish sharp, compressed-sensing-style results: (i) estimation and value suboptimality scale as k (log M / T)^{1/2} under a Policy-RSC condition; (ii) exact tool-support recovery holds via primal-dual witness arguments when T > k log M under incoherence and beta-min; and (iii) any dense policy class requires Omega(M) samples, explaining the…
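
The abstract does not say how the ell_{1,2}-regularized surrogate is solved. One standard building block for such penalties is block soft-thresholding, the proximal operator of the ell_{1,2} (group-lasso) norm. The sketch below assumes each action owns one parameter block; that layout, the function names, and the threshold `lam` are illustrative assumptions, not the paper's algorithm.

```python
import math
from typing import List


def block_soft_threshold(blocks: List[List[float]], lam: float) -> List[List[float]]:
    """Proximal operator of lam * sum_j ||block_j||_2 (standard group-lasso prox).

    Assumption: one block per action; lam >= 0 is a hypothetical threshold.
    """
    shrunk = []
    for block in blocks:
        norm = math.sqrt(sum(x * x for x in block))
        if norm <= lam:
            # The whole block is zeroed, so the action leaves the support.
            shrunk.append([0.0] * len(block))
        else:
            # Otherwise the block is scaled toward zero, keeping its direction.
            scale = 1.0 - lam / norm
            shrunk.append([scale * x for x in block])
    return shrunk


def support(blocks: List[List[float]], tol: float = 1e-12) -> List[int]:
    """Indices of blocks whose Euclidean norm exceeds tol."""
    return [
        j for j, block in enumerate(blocks)
        if math.sqrt(sum(x * x for x in block)) > tol
    ]


if __name__ == "__main__":
    params = [[3.0, 4.0], [0.1, 0.1], [0.0, 0.0]]
    shrunk = block_soft_threshold(params, lam=1.0)
    print(shrunk, support(shrunk))
```

Because a block is either removed entirely or kept, repeated application inside a proximal-gradient loop tends to produce the block-sparse policies the abstract refers to.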

Taxonomy

Topics: Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research · Adaptive Dynamic Programming Control