Prompt Injection as an Emerging Threat: Evaluating the Resilience of Large Language Models

Daniyal Ganiuly; Assel Smaiyl

arXiv:2511.01634·cs.CR·November 13, 2025

Prompt Injection as an Emerging Threat: Evaluating the Resilience of Large Language Models

Daniyal Ganiuly, Assel Smaiyl

PDF

Open Access

TL;DR

This paper introduces a unified framework with three metrics to evaluate the robustness of large language models against prompt injection attacks, revealing that safety tuning enhances resilience more than size.

Contribution

It proposes a comprehensive evaluation framework for prompt injection resilience and demonstrates the importance of safety tuning over model size for robustness.

Findings

01

GPT-4 shows the highest resilience among models tested

02

Open-source models are more vulnerable to prompt injection

03

Safety tuning significantly improves model robustness

Abstract

Large Language Models (LLMs) are increasingly used in intelligent systems that perform reasoning, summarization, and code generation. Their ability to follow natural-language instructions, while powerful, also makes them vulnerable to a new class of attacks known as prompt injection. In these attacks, hidden or malicious instructions are inserted into user inputs or external content, causing the model to ignore its intended task or produce unsafe responses. This study proposes a unified framework for evaluating how resistant Large Language Models (LLMs) are to prompt injection attacks. The framework defines three complementary metrics such as the Resilience Degradation Index (RDI), Safety Compliance Coefficient (SCC), and Instructional Integrity Metric (IIM) to jointly measure robustness, safety, and semantic stability. We evaluated four instruction-tuned models (GPT-4, GPT-4o, LLaMA-3…

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsAdversarial Robustness in Machine Learning · Topic Modeling · Artificial Intelligence in Healthcare and Education