GAMMAF: A Common Framework for Graph-Based Anomaly Monitoring Benchmarking in LLM Multi-Agent Systems

Pablo Mateo-Torrejón; Alfonso Sánchez-Macián

arXiv:2604.24477 · cs.CR · April 29, 2026

TL;DR

Gammaf is an open-source benchmarking platform designed to evaluate graph-based anomaly detection methods in LLM multi-agent systems, providing synthetic datasets and performance assessment tools.

Contribution

The paper introduces a standardized, reproducible framework for training and benchmarking graph-based defense models against anomalies in LLM multi-agent systems, an environment the field currently lacks.

Findings

01. Gammaf demonstrates high utility and scalability in evaluating defense models.

02. Effective attack remediation improves system integrity and reduces operational costs.

03. Benchmarking with XG-Guard and BlindGuard shows the framework's effectiveness.

Abstract

The rapid integration of Large Language Models (LLMs) into Multi-Agent Systems (MAS) has significantly enhanced their collaborative problem-solving capabilities, but it has also expanded their attack surfaces, exposing them to vulnerabilities such as prompt infection and compromised inter-agent communication. While emerging graph-based anomaly detection methods show promise in protecting these networks, the field currently lacks a standardized, reproducible environment to train these models and evaluate their efficacy. To address this gap, we introduce Gammaf (Graph-based Anomaly Monitoring for LLM Multi-Agent systems Framework), an open-source benchmarking platform. Gammaf is not a novel defense mechanism itself, but rather a comprehensive evaluation architecture designed to generate synthetic multi-agent interaction datasets and benchmark the performance of existing and future defense…
