Blue Teaming Function-Calling Agents

Greta Dolcetti; Giulio Zizzo; Sergio Maffeis

arXiv:2601.09292·cs.CR·January 15, 2026

Open Access

TL;DR

This paper evaluates the robustness of four open source LLMs with function-calling capabilities against three attacks and measures the effectiveness of eight defenses, finding that the models are not safe by default and that the defenses are not yet practical for real-world applications.

Contribution

The paper provides an experimental assessment of the security vulnerabilities of function-calling LLMs and evaluates how effective existing defenses are against the attacks considered.

Findings

01. Models are unsafe by default
02. Defenses are not ready for real-world use
03. Significant security gaps remain in current LLMs

Abstract

We present an experimental evaluation that assesses the robustness of four open source LLMs claiming function-calling capabilities against three different attacks, and we measure the effectiveness of eight different defences. Our results show how these models are not safe by default, and how the defences are not yet employable in real-world scenarios.
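To make the scale of the setup in the abstract concrete, the sketch below enumerates the evaluation cells implied by four models, three attacks, and eight defences. It assumes a full factorial design plus an undefended baseline, which the abstract does not confirm; the labels such as `model_1` and the helper `evaluation_grid` are hypothetical placeholders, since the abstract gives only the counts.

```python
from itertools import product

# Placeholder labels only: the abstract states the counts, not the names.
MODELS = [f"model_{i}" for i in range(1, 5)]      # four open source LLMs
ATTACKS = [f"attack_{i}" for i in range(1, 4)]    # three attacks
DEFENCES = [f"defence_{i}" for i in range(1, 9)]  # eight defences


def evaluation_grid(include_baseline=True):
    """Return every (model, attack, defence) cell.

    A defence of None marks the undefended baseline run.
    Assumption: every model is tested against every attack and defence.
    """
    defences = ([None] if include_baseline else []) + DEFENCES
    return list(product(MODELS, ATTACKS, defences))


grid = evaluation_grid()
baseline_cells = [cell for cell in grid if cell[2] is None]
defended_cells = [cell for cell in grid if cell[2] is not None]

# 4 * 3 = 12 baseline cells, 4 * 3 * 8 = 96 defended cells, 108 in total
print(len(baseline_cells), len(defended_cells), len(grid))
```

Under these assumptions the grid has 12 undefended and 96 defended cells. The paper's actual experimental matrix may differ if some combinations were not run.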

Taxonomy

Topics: Security and Verification in Computing · Advanced Malware Detection Techniques · Web Application Security Vulnerabilities