Bridging AI and Software Security: A Comparative Vulnerability Assessment of LLM Agent Deployment Paradigms

Tarek Gasmi, Ramzi Guesmi, Ines Belhadj, and Jihene Bennaceur

arXiv:2507.06323·cs.CR·July 10, 2025

TL;DR

This study compares the security vulnerabilities of two LLM agent deployment paradigms, Function Calling and the Model Context Protocol (MCP), showing how architectural choices influence threat exposure and providing a foundation for secure AI system deployment.

Contribution

It introduces a unified threat classification framework and conducts a comprehensive evaluation of attack scenarios across paradigms, highlighting their distinct vulnerability profiles.

Findings

01

Function Calling has higher overall attack success rates (73.5%) than MCP (62.59%)

02

Chained attacks achieve 91-96% success rates, showing increased effectiveness with complexity

03

Advanced reasoning models are more exploitable despite better threat detection capabilities

Abstract

Large Language Model (LLM) agents face security vulnerabilities spanning AI-specific and traditional software domains, yet current research addresses these separately. This study bridges this gap through a comparative evaluation of the Function Calling and Model Context Protocol (MCP) deployment paradigms using a unified threat classification framework. We tested 3,250 attack scenarios across seven language models, evaluating simple, composed, and chained attacks targeting both AI-specific threats (prompt injection) and software vulnerabilities (JSON injection, denial-of-service). Function Calling showed higher overall attack success rates (73.5% vs 62.59% for MCP) and greater system-centric vulnerability, while MCP exhibited increased LLM-centric exposure. Attack complexity dramatically amplified effectiveness, with chained attacks achieving 91-96% success rates.…
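To put the two overall success rates in perspective, the gap can be worked out directly from the figures reported in the abstract. The sketch below is illustrative arithmetic only: the dictionary and the helper `asr_gap` are hypothetical names introduced here, not part of the paper, and only the two percentages come from the source.

```python
# Overall attack success rates (percent) as reported in the abstract.
# The variable and function names are illustrative assumptions, not the authors' code.
REPORTED_ASR = {"Function Calling": 73.5, "MCP": 62.59}


def asr_gap(rates, higher, lower):
    """Return the gap between two success rates.

    The first value is the absolute gap in percentage points; the second
    is that gap expressed as a percentage of the higher rate.
    """
    absolute = rates[higher] - rates[lower]
    relative = 100.0 * absolute / rates[higher]
    return round(absolute, 2), round(relative, 2)


gap_pp, gap_rel = asr_gap(REPORTED_ASR, "Function Calling", "MCP")
print(f"Absolute gap: {gap_pp} percentage points")
print(f"Relative to Function Calling: {gap_rel}% lower for MCP")
```

Separating the absolute gap (percentage points) from the relative one avoids the common ambiguity when a difference between two percentages is quoted as a single number.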

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Information and Cyber Security · Topic Modeling