OET: Optimization-based prompt injection Evaluation Toolkit
Jinsheng Pan, Xiaogeng Liu, Chaowei Xiao

TL;DR
OET is a comprehensive, optimization-based toolkit designed to systematically evaluate and benchmark the robustness of large language models against prompt injection attacks, highlighting vulnerabilities even in defended models.
Contribution
We introduce OET, a modular, adaptive evaluation framework that rigorously tests LLMs against prompt injection attacks using optimization techniques in both white-box and black-box scenarios.
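To make the idea of optimization-based testing in a black-box setting concrete, here is a minimal sketch. It is not OET's code or API, which this summary does not show. The scoring function `toy_target`, the vocabulary `VOCAB`, and the helpers `random_search_suffix` and `attack_success_rate` are all hypothetical names introduced for illustration, and plain random search stands in for whatever optimizer the toolkit actually uses. A real evaluation would replace `toy_target` with queries to a model, optionally wrapped in a defense.

```python
import random

# Hypothetical token pool for the adversarial suffix (illustration only).
VOCAB = ["ignore", "previous", "instructions", "please",
         "print", "secret", "now", "system"]


def toy_target(prompt: str) -> float:
    """Hypothetical stand-in for a model under test.

    Returns a score in [0, 1]: the fraction of trigger words present,
    used here as a proxy for how strongly an injected task is followed.
    """
    triggers = {"ignore", "previous", "instructions"}
    return len(triggers & set(prompt.split())) / len(triggers)


def random_search_suffix(score_fn, base_prompt, suffix_len=4, steps=200, seed=0):
    """Black-box search: mutate one suffix token at a time and keep the
    candidate whenever the score does not get worse."""
    rng = random.Random(seed)
    suffix = [rng.choice(VOCAB) for _ in range(suffix_len)]
    best = score_fn(base_prompt + " " + " ".join(suffix))
    for _ in range(steps):
        candidate = list(suffix)
        candidate[rng.randrange(suffix_len)] = rng.choice(VOCAB)
        score = score_fn(base_prompt + " " + " ".join(candidate))
        if score >= best:
            suffix, best = candidate, score
    return " ".join(suffix), best


def attack_success_rate(score_fn, prompts, threshold=1.0, **search_kwargs):
    """Fraction of prompts for which the optimized suffix reaches the threshold."""
    hits = sum(
        random_search_suffix(score_fn, p, **search_kwargs)[1] >= threshold
        for p in prompts
    )
    return hits / len(prompts)


if __name__ == "__main__":
    suffix, score = random_search_suffix(toy_target, "Summarize this document.")
    print("suffix:", suffix, "| score:", score)
    print("ASR:", attack_success_rate(toy_target, ["Translate this.", "Summarize this."]))
```

The search only needs scores from `score_fn`, which is what makes it black-box. A white-box variant would instead use gradients from the model to propose token substitutions.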
Findings
Current defenses are often ineffective against adaptive attacks
OET reveals vulnerabilities in models despite security measures
The toolkit enables strict red-teaming evaluations
Abstract
Large Language Models (LLMs) have demonstrated remarkable capabilities in natural language understanding and generation, enabling their widespread adoption across various domains. However, their susceptibility to prompt injection attacks poses significant security risks, as adversarial inputs can manipulate model behavior and override intended instructions. Despite numerous defense strategies, a standardized framework to rigorously evaluate their effectiveness, especially under adaptive adversarial scenarios, is lacking. To address this gap, we introduce OET, an optimization-based evaluation toolkit that systematically benchmarks prompt injection attacks and defenses across diverse datasets using an adaptive testing framework. Our toolkit features a modular workflow that facilitates adversarial string generation, dynamic attack execution, and comprehensive result analysis, offering a…
Taxonomy
Topics: Real-time simulation and control systems · Software Testing and Debugging Techniques · VLSI and Analog Circuit Testing
