Prompt Injection Mitigation with Agentic AI, Nested Learning, and AI Sustainability via Semantic Caching
Diego Gosmar, Deborah A. Dahl

TL;DR
This paper presents a new evaluation framework and a multi-agent system that enhances prompt injection mitigation in large language models, improving security, efficiency, and sustainability without altering core models.
Contribution
It introduces TIVS-O, an extended evaluation metric incorporating semantic caching and observability, and demonstrates a multi-agent architecture that achieves secure, efficient, and environmentally sustainable LLM deployment.
Findings
Zero high-risk breaches achieved in experiments.
Semantic caching reduces LLM calls by 41.6%.
Trade-offs between mitigation strictness and transparency are identified.
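As a rough illustration of how semantic caching can cut LLM calls, the sketch below reuses a stored response whenever a new prompt is close enough to one already answered. It is a minimal sketch under stated assumptions, not the paper's Continuum Memory System: the bag-of-words `embed` function stands in for a real sentence encoder, and the `SemanticCache` class, the 0.8 threshold, and the `llm_call` callback are all hypothetical names and values chosen for illustration.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: bag-of-words counts (assumption: stand-in for a real encoder)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class SemanticCache:
    """Hypothetical similarity-based cache placed in front of an LLM call."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold  # illustrative value, not taken from the paper
        self.entries = []           # list of (embedding, response) pairs
        self.hits = 0
        self.misses = 0

    def lookup(self, prompt):
        """Return the cached response of the most similar prompt, or None."""
        query = embed(prompt)
        best_score, best_response = 0.0, None
        for vec, response in self.entries:
            score = cosine(query, vec)
            if score > best_score:
                best_score, best_response = score, response
        return best_response if best_score >= self.threshold else None

    def get_or_call(self, prompt, llm_call):
        """Serve from cache on a near-duplicate prompt; otherwise call the model."""
        cached = self.lookup(prompt)
        if cached is not None:
            self.hits += 1
            return cached
        self.misses += 1
        response = llm_call(prompt)
        self.entries.append((embed(prompt), response))
        return response

    def call_reduction(self):
        """Fraction of requests answered without an LLM call."""
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

The value returned by `call_reduction()` corresponds in spirit to the reported reduction in LLM calls, although the figure any given run produces depends entirely on the prompt set, the encoder, and the threshold. In a security setting the threshold also matters for safety: a loose match could return a verdict cached for a benign prompt to a subtly different malicious one.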
Abstract
Prompt injection remains a central obstacle to the safe deployment of large language models, particularly in multi-agent settings where intermediate outputs can propagate or amplify malicious instructions. Building on earlier work that introduced a four-metric Total Injection Vulnerability Score (TIVS), this paper extends the evaluation framework with semantic similarity-based caching and a fifth metric (Observability Score Ratio) to yield TIVS-O, investigating how defence effectiveness interacts with transparency in a HOPE-inspired Nested Learning architecture. The proposed system combines an agentic pipeline with Continuum Memory Systems that implement semantic similarity-based caching across 301 synthetically generated injection-focused prompts drawn from ten attack families, while a fourth agent performs comprehensive security analysis using five key performance indicators. In…
Taxonomy
Topics: Security and Verification in Computing · Adversarial Robustness in Machine Learning · Network Security and Intrusion Detection
