AgenTRIM: Tool Risk Mitigation for Agentic AI

Roy Betser; Shamik Bose; Amit Giloni; Chiara Picardi; Sindhu Padakandla; Roman Vainshtein

arXiv:2601.12449 · cs.CR · January 21, 2026

Open Access

TL;DR

AgenTRIM is a framework that detects and mitigates security risks in autonomous AI agents that use external tools, enforcing least-privilege tool permissions while maintaining task performance.

Contribution

The paper introduces an approach that identifies and reduces tool-driven agency risks in AI agents through offline verification of the agent's tool interface and online, per-step permission enforcement.
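As a rough illustration of what an offline check of an agent's tool interface might look like, the sketch below compares the tools an agent declares with the tools that appear in recorded execution traces. This is not AgenTRIM's procedure; the names `InterfaceAudit` and `audit_tool_interface`, the trace format (a list of tool names per run), and the example tool names are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class InterfaceAudit:
    """Result of comparing declared tools with tools seen in traces (hypothetical type)."""
    undeclared: FrozenSet[str]  # invoked in some trace but missing from the declared interface
    unused: FrozenSet[str]      # declared but never invoked in any trace


def audit_tool_interface(declared_tools: Iterable[str],
                         traces: List[List[str]]) -> InterfaceAudit:
    """Assumed helper: each trace is simply the list of tool names one run invoked."""
    declared = frozenset(declared_tools)
    observed = frozenset(name for trace in traces for name in trace)
    return InterfaceAudit(undeclared=observed - declared,
                          unused=declared - observed)


# Hypothetical agent: declares three tools, but the traces show a fourth one being called.
audit = audit_tool_interface(
    declared_tools=["read_email", "send_email", "delete_file"],
    traces=[["read_email", "send_email"], ["read_email", "http_post"]],
)
print(sorted(audit.undeclared), sorted(audit.unused))
```

In this toy setting, a tool that is declared but never used is a candidate for excessive agency, and a tool that is used but not declared indicates that the recorded interface is incomplete.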

Findings

01. Significantly reduces attack success on the AgentDojo benchmark

02. Maintains high task performance despite risk mitigation

03. Remains robust against description-based attacks and enforces safety policies

Abstract

AI agents are autonomous systems that combine LLMs with external tools to solve complex tasks. While such tools extend capability, improper tool permissions introduce security risks such as indirect prompt injection and tool misuse. We characterize these failures as unbalanced tool-driven agency. Agents may retain unnecessary permissions (excessive agency) or fail to invoke required tools (insufficient agency), amplifying the attack surface and reducing performance. We introduce AgenTRIM, a framework for detecting and mitigating tool-driven agency risks without altering an agent's internal reasoning. AgenTRIM addresses these risks through complementary offline and online phases. Offline, AgenTRIM reconstructs and verifies the agent's tool interface from code and execution traces. At runtime, it enforces per-step least-privilege tool access through adaptive filtering and status-aware…
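The idea of per-step least-privilege tool access can be illustrated with a minimal deny-by-default sketch. It assumes a static table mapping each step to the tools it may use, which is a simplification: the abstract describes adaptive filtering, and this example does not attempt to reproduce it. The names `allowed_tools`, `call_tool`, `registry`, and `step_policy`, along with the example tools, are hypothetical.

```python
from typing import Callable, Dict, Set

ToolRegistry = Dict[str, Callable[..., object]]


def allowed_tools(registry: ToolRegistry,
                  step_policy: Dict[str, Set[str]],
                  step: str) -> ToolRegistry:
    """Return only the tools granted for this step; unknown steps get no tools."""
    granted = step_policy.get(step, set())
    return {name: fn for name, fn in registry.items() if name in granted}


def call_tool(registry: ToolRegistry,
              step_policy: Dict[str, Set[str]],
              step: str, name: str, *args: object) -> object:
    """Invoke a tool only if the current step's policy exposes it."""
    tools = allowed_tools(registry, step_policy, step)
    if name not in tools:
        raise PermissionError(f"tool {name!r} is not permitted at step {step!r}")
    return tools[name](*args)


# Hypothetical tools and policy, for illustration only.
registry: ToolRegistry = {
    "read_inbox": lambda: ["invoice from Bob"],
    "send_money": lambda amount: f"sent {amount}",
}
step_policy = {
    "summarize_inbox": {"read_inbox"},
    "pay_invoice": {"read_inbox", "send_money"},
}

inbox = call_tool(registry, step_policy, "summarize_inbox", "read_inbox")
```

With this policy, an instruction injected into an email during the `summarize_inbox` step cannot trigger `send_money`, because that tool is not exposed at that step and the call raises `PermissionError`.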

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Security and Verification in Computing · Information and Cyber Security