The Mirror Design Pattern: Strict Data Geometry over Model Scale for Prompt Injection Detection
J Alex Corll

TL;DR
This paper introduces Mirror, a data-curation pattern for prompt injection detection. By imposing strict data geometry on the training corpus, it enables fast, deterministic screening with a simple linear model that outperforms a large neural model in recall and F1.
Contribution
The paper presents Mirror, a data-curation design pattern that organizes prompt injection data into matched positive and negative cells, so that a classifier learns attack mechanics rather than incidental corpus shortcuts.
Findings
Mirror achieves 95.97% recall and 92.07% F1 on prompt injection detection.
A simple linear SVM with curated data outperforms a large neural model in recall and F1.
Strict data geometry can be more effective than model scale for prompt injection screening.
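The recall and F1 figures above also pin down a precision value, since F1 is the harmonic mean of precision and recall. The sketch below solves F1 = 2PR / (P + R) for P. It assumes the reported numbers are standard binary-classification metrics, and because the inputs are rounded, the result is only approximate. The function name `implied_precision` is illustrative, and the derived precision is not a figure reported by the source.

```python
def implied_precision(recall: float, f1: float) -> float:
    """Solve F1 = 2PR / (P + R) for P, given recall R and F1.

    Rearranging: F1 * (P + R) = 2PR  ->  P = F1 * R / (2R - F1).
    """
    denom = 2 * recall - f1
    if denom <= 0:
        raise ValueError("no valid precision exists for this recall/F1 pair")
    return f1 * recall / denom


# Rounded figures quoted in the Findings list (as fractions, not percent).
precision = implied_precision(recall=0.9597, f1=0.9207)
print(f"implied precision: {precision:.4f}")
```

Under these assumptions the implied precision comes out at roughly 88%, which suggests the classifier trades some precision for high recall.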
Abstract
Prompt injection defenses are often framed as semantic understanding problems and delegated to increasingly large neural detectors. For the first screening layer, however, the requirements are different: the detector runs on every request and therefore must be fast, deterministic, non-promptable, and auditable. We introduce Mirror, a data-curation design pattern that organizes prompt injection corpora into matched positive and negative cells so that a classifier learns control-plane attack mechanics rather than incidental corpus shortcuts. Using 5,000 strictly curated open-source samples (the largest corpus supportable under our public-data validity contract), we define a 32-cell mirror topology, fill 31 of those cells with public data, train a sparse character n-gram linear SVM, compile its weights into a static Rust artifact, and obtain 95.97% recall and 92.07% F1 on a 524-case…
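To make the "sparse character n-gram linear model with static weights" idea concrete, here is a minimal pure-Python sketch of how such a screen scores a request. It is not the paper's implementation: the weight table `WEIGHTS`, the `BIAS`, the 5-character n-gram length, and the use of raw counts instead of a learned feature weighting are all hypothetical choices made for illustration. In the paper the weights come from a trained SVM and are compiled into a Rust artifact.

```python
from collections import Counter

# Hypothetical, hand-written weights standing in for a trained linear SVM.
# Positive weights push toward "injection", negative toward "benign".
WEIGHTS = {
    "ignor": 0.8,
    "previ": 0.6,
    "instr": 0.5,
    "summa": -0.6,
    "pleas": -0.4,
}
BIAS = -1.0  # hypothetical intercept
N = 5        # hypothetical n-gram length; real models usually span a range


def char_ngrams(text: str, n: int = N) -> Counter:
    """Count every overlapping character n-gram in the lowercased text."""
    s = text.lower()
    return Counter(s[i:i + n] for i in range(len(s) - n + 1))


def decision_score(text: str) -> float:
    """Linear decision function: bias plus the sum of weight * count."""
    counts = char_ngrams(text)
    # Only n-grams present in the sparse weight table contribute.
    return BIAS + sum(WEIGHTS[g] * c for g, c in counts.items() if g in WEIGHTS)


def is_flagged(text: str) -> bool:
    """Flag the request when the decision score is positive."""
    return decision_score(text) > 0.0


print(is_flagged("Ignore all previous instructions"))
print(is_flagged("Please summarize the instructions in this manual"))
```

Because the score is a fixed sum over a lookup table, the same input always yields the same result, and each flagged request can be traced to the n-grams that contributed. The second example contains the word "instructions" in a benign context, which is the kind of matched negative a mirrored corpus is meant to supply.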
