Evaluating Prompt Injection Defenses for Educational LLM Tutors: Security-Usability-Latency Trade-offs

Alexandre Cristovão Maiorano

arXiv:2605.06669·cs.CR·May 22, 2026

TL;DR

This paper evaluates prompt-injection defenses for educational language models, analyzing security, usability, and latency trade-offs to guide guardrail selection in AI tutoring systems.

Contribution

It introduces a comprehensive evaluation methodology and benchmark protocol for comparing prompt-injection defenses in educational LLM tutors.

Findings

01

The multi-layer safeguard pipeline achieves low bypass and false positive rates with optimized latency.

02

NeMo Guardrails reaches 0% bypass at a 16.22% false positive rate (FPR) and ~1.5 s latency.

03

Prompt Guard yields 38.48% bypass with 3.60% FPR.
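The findings above are stated in terms of bypass rate, false positive rate (FPR), and latency. As a rough illustration of how such figures could be computed from labeled benchmark trials, here is a minimal sketch. It assumes that bypass rate is the fraction of attack prompts the defense failed to block and that FPR is the fraction of benign prompts it blocked; the paper's exact definitions may differ. The `Trial` record and the `summarize` helper are hypothetical names introduced for illustration and are not taken from the paper.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Trial:
    """One benchmark prompt and the defense's verdict (hypothetical schema)."""
    is_attack: bool    # ground-truth label: True for an injection attempt
    blocked: bool      # True if the defense rejected the prompt
    latency_s: float   # time the defense took on this prompt, in seconds


def summarize(trials):
    """Return bypass rate, FPR, and mean latency for a list of trials.

    Assumed definitions (not confirmed by the source):
      bypass rate = unblocked attacks / all attacks
      FPR         = blocked benign prompts / all benign prompts
    """
    attacks = [t for t in trials if t.is_attack]
    benign = [t for t in trials if not t.is_attack]
    if not attacks or not benign:
        raise ValueError("need at least one attack and one benign trial")

    bypass_rate = sum(not t.blocked for t in attacks) / len(attacks)
    fpr = sum(t.blocked for t in benign) / len(benign)
    mean_latency_s = mean(t.latency_s for t in trials)
    return {
        "bypass_rate": bypass_rate,
        "fpr": fpr,
        "mean_latency_s": mean_latency_s,
    }
```

Reporting all three numbers for the same set of trials matters because a defense can drive bypass to zero simply by blocking more benign traffic, which shows up only in the FPR column.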

Abstract

Educational LLM tutors face a core AI alignment challenge: they must follow user intent while preserving pedagogical constraints and safety policies. We present an evaluation methodology for prompt-injection defenses in this setting, showing that guardrail design entails explicit trade-offs among adversarial robustness, benign-task usability, and response latency. We evaluate a domain-specific multi-layer safeguard pipeline combining deterministic pattern filters, structural validation, contextual sandboxing, and session-level behavioral checks. On a controlled holdout benchmark, the pipeline reaches low bypass and false positive rates with optimized average latency, an operating point that prioritizes pedagogical usability (zero false positives) while maintaining measurable attack resistance. We provide a reproducible benchmark protocol for head-to-head comparison under identical…
