Robustness of Agentic AI Systems via Adversarially-Aligned Jacobian Regularization

Furkan Mumcu; Yasin Yilmaz

arXiv:2603.04378·cs.LG·March 5, 2026

Open Access

TL;DR

This paper introduces Adversarially-Aligned Jacobian Regularization (AAJR), which improves the robustness of agentic AI systems by controlling sensitivity only along adversarial ascent directions. This is less conservative than global Jacobian bounds and helps stabilize minimax training.

Contribution

The paper proposes AAJR, a trajectory-aligned Jacobian regularization technique that constrains sensitivity along adversarial ascent directions rather than in all directions, making robust training of multi-agent AI systems less conservative than with global Jacobian bounds.

Findings

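01 AAJR yields a strictly larger admissible policy class than global Jacobian constraints, under mild conditions.

02 Under the derived step-size conditions, AAJR controls effective smoothness along optimization trajectories, ensuring stability.

03 The analysis implies a weakly smaller approximation gap and reduced nominal performance degradation than under global constraints.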

Abstract

As Large Language Models (LLMs) transition into autonomous multi-agent ecosystems, robust minimax training becomes essential yet remains prone to instability when highly non-linear policies induce extreme local curvature in the inner maximization. Standard remedies that enforce global Jacobian bounds are overly conservative, suppressing sensitivity in all directions and inducing a large Price of Robustness. We introduce Adversarially-Aligned Jacobian Regularization (AAJR), a trajectory-aligned approach that controls sensitivity strictly along adversarial ascent directions. We prove that AAJR yields a strictly larger admissible policy class than global constraints under mild conditions, implying a weakly smaller approximation gap and reduced nominal performance degradation. Furthermore, we derive step-size conditions under which AAJR controls effective smoothness along optimization…
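The contrast the abstract draws between a global Jacobian bound and a penalty restricted to the adversarial ascent direction can be illustrated with a small sketch. Everything below is an assumption made for illustration, not the paper's formulation: `policy` and `loss` are hypothetical toy functions, the global penalty is taken to be the squared Frobenius norm of the Jacobian, the directional penalty is taken to be the squared norm of the Jacobian applied to the unit gradient of the loss, and central finite differences stand in for automatic differentiation.

```python
import math

def policy(x):
    # Hypothetical toy policy R^2 -> R^2 (an assumption, not from the paper).
    return [math.tanh(3.0 * x[0] + x[1]), 0.5 * x[0] * x[1]]

def loss(x):
    # Hypothetical adversarial objective that the inner maximization ascends.
    y = policy(x)
    return y[0] ** 2 + y[1] ** 2

def norm(v):
    # Euclidean norm of a list of floats.
    return math.sqrt(sum(c * c for c in v))

def grad(f, x, eps=1e-5):
    # Central finite-difference gradient of a scalar function f at x.
    g = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += eps
        xm[i] -= eps
        g.append((f(xp) - f(xm)) / (2.0 * eps))
    return g

def jvp(f, x, v, eps=1e-5):
    # Central finite-difference Jacobian-vector product J_f(x) v.
    xp = [xi + eps * vi for xi, vi in zip(x, v)]
    xm = [xi - eps * vi for xi, vi in zip(x, v)]
    return [(a - b) / (2.0 * eps) for a, b in zip(f(xp), f(xm))]

def directional_penalty(x):
    # ||J(x) v||^2, where v is the unit ascent direction of the loss at x.
    g = grad(loss, x)
    n = norm(g)
    if n == 0.0:
        return 0.0  # no ascent direction at a stationary point
    v = [c / n for c in g]
    return norm(jvp(policy, x, v)) ** 2

def global_penalty(x):
    # ||J(x)||_F^2, accumulated column by column over the standard basis.
    total = 0.0
    for i in range(len(x)):
        e = [1.0 if j == i else 0.0 for j in range(len(x))]
        total += norm(jvp(policy, x, e)) ** 2
    return total

x0 = [0.3, -0.2]
print("directional:", directional_penalty(x0))
print("global:     ", global_penalty(x0))
```

For any unit vector v, the squared norm of Jv is at most the squared Frobenius norm of J, so the directional penalty never exceeds the global one at the same point. In this sketch, a policy that satisfies a global budget therefore also satisfies the same budget along the ascent direction. The strict enlargement of the admissible class claimed in the abstract depends on the paper's own conditions, which this toy example does not reproduce.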

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Reinforcement Learning in Robotics · Stochastic Gradient Optimization Techniques