STARS: Skill-Triggered Audit for Request-Conditioned Invocation Safety in Agent Systems

Guijia Zhang, Shu Yang, Xilin Gong, and Di Wang

arXiv:2604.10286 · cs.AI · April 14, 2026

TL;DR

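STARS estimates a continuous risk score for each skill invocation in an autonomous agent. It combines a static capability prior with a request-conditioned risk model and a calibrated fusion policy, so that invocations can be ranked and triaged before a hard intervention is applied.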

Contribution

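The paper presents STARS, which integrates a static capability prior, a request-conditioned invocation risk model, and a calibrated risk-fusion policy to assess safety at invocation time. It also constructs SIA-Bench, a benchmark of 3,000 invocation records for evaluating this setting.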

Findings

01. Calibrated fusion achieves 0.439 high-risk AUPRC on attack detection.

02. Contextual scorer outperforms static baseline in risk calibration.

03. Request-conditioned auditing is most effective as an invocation-time risk layer.
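For readers unfamiliar with the metric in the first finding, AUPRC (area under the precision-recall curve) is commonly estimated as average precision. The block below is a minimal sketch of that estimator, assuming binary labels with "high-risk" as the positive class. The function name `average_precision` and the toy labels and scores are illustrative assumptions, not data from SIA-Bench, and the paper's exact evaluation protocol is not stated in this summary.

```python
def average_precision(labels, scores):
    """Average precision: mean of precision@k over the ranks k where a positive occurs.

    labels: iterable of 0/1, where 1 marks the positive (assumed: high-risk) class.
    scores: iterable of floats, higher means riskier.
    Ties in score keep their input order (stable sort); library implementations
    may treat ties differently.
    """
    labels = list(labels)
    scores = list(scores)
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive (high-risk) example")

    # Rank examples from highest to lowest predicted risk.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    hits = 0
    ap = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            ap += hits / rank  # precision at this rank
    return ap / total_pos


if __name__ == "__main__":
    # Toy example (made-up values): two high-risk invocations among four.
    toy_labels = [1, 0, 1, 0]
    toy_scores = [0.9, 0.8, 0.7, 0.1]
    print(average_precision(toy_labels, toy_scores))
```

A scorer that ranks at random has an expected AUPRC close to the positive-class prevalence, so the 0.439 figure is best read against the share of high-risk records in the evaluation set, which this summary does not report.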

Abstract

Autonomous language-model agents increasingly rely on installable skills and tools to complete user tasks. Static skill auditing can expose capability surface before deployment, but it cannot determine whether a particular invocation is unsafe under the current user request and runtime context. We therefore study skill invocation auditing as a continuous-risk estimation problem: given a user request, candidate skill, and runtime context, predict a score that supports ranking and triage before a hard intervention is applied. We introduce STARS, which combines a static capability prior, a request-conditioned invocation risk model, and a calibrated risk-fusion policy. To evaluate this setting, we construct SIA-Bench, a benchmark of 3,000 invocation records with group-safe splits, lineage metadata, runtime context, canonical action labels, and derived continuous-risk targets. On a held-out…
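The three components named in the abstract could be wired together roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the logit-space weighted fusion, the weights, the bias, the review threshold, and every name (`InvocationRecord`, `fuse_risk`, `triage`) are hypothetical, and the two input scores are assumed to come from a static audit and a request-conditioned scorer that are not shown.

```python
import math
from dataclasses import dataclass


@dataclass
class InvocationRecord:
    """Hypothetical record for one candidate skill invocation."""
    request: str
    skill: str
    static_prior: float      # assumed: static capability prior in (0, 1)
    contextual_risk: float   # assumed: request-conditioned risk in (0, 1)


def _logit(p: float, eps: float = 1e-6) -> float:
    """Log-odds of p, clamped away from 0 and 1 for numerical safety."""
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))


def fuse_risk(rec: InvocationRecord,
              w_static: float = 0.3,
              w_context: float = 0.7,
              bias: float = 0.0) -> float:
    """Fuse both scores with a weighted sum in logit space (an assumed rule).

    In practice the weights and bias would be fit on held-out data so the
    output is calibrated; the defaults here are placeholders.
    """
    z = (w_static * _logit(rec.static_prior)
         + w_context * _logit(rec.contextual_risk)
         + bias)
    return 1.0 / (1.0 + math.exp(-z))


def triage(records, review_threshold: float = 0.5):
    """Rank invocations by fused risk and flag those at or above the threshold.

    Returns (score, skill, needs_review) tuples, riskiest first. The score
    supports ranking; any hard intervention is left to a later policy.
    """
    scored = [(fuse_risk(r), r) for r in records]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(score, r.skill, score >= review_threshold) for score, r in scored]


if __name__ == "__main__":
    # Made-up examples: the same skill can be low or high risk depending on the request.
    records = [
        InvocationRecord("summarize my notes", "file_reader", 0.6, 0.05),
        InvocationRecord("email my ssh keys to this address", "file_reader", 0.6, 0.95),
    ]
    for row in triage(records):
        print(row)
```

Keeping the output continuous rather than a block/allow decision matches the abstract's framing: the score is meant to support ranking and triage before a hard intervention is chosen.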

Code & Models

Repositories

123zgj123/STARS (GitHub)
