Proactive Hardening of LLM Defenses with HASTE

Henry Chen; Victor Aranda; Samarth Keshari; Ryan Heartfield; Nicole Nichols

arXiv:2601.19051·cs.CR·January 28, 2026

Proactive Hardening of LLM Defenses with HASTE

Henry Chen, Victor Aranda, Samarth Keshari, Ryan Heartfield, Nicole Nichols

PDF

Open Access

TL;DR

HASTE is a systematic framework that proactively and reactively enhances LLM defenses by generating adaptive attack prompts, significantly improving prompt detection and hardening strategies against prompt-based attacks.

Contribution

The paper introduces HASTE, a modular framework for generating evasive prompts to improve prompt-based attack detection and defense in LLMs, with demonstrated effectiveness.

Findings

01

Reduces malicious prompt detection by approximately 64% with hard negative mining.

02

Optimizes prompt detection models with fewer iteration loops.

03

Supports both proactive stress-testing and reactive attack modeling.

Abstract

Prompt-based attack techniques are one of the primary challenges in securely deploying and protecting LLM-based AI systems. LLM inputs are an unbounded, unstructured space. Consequently, effectively defending against these attacks requires proactive hardening strategies capable of continuously generating adaptive attack vectors to optimize LLM defense at runtime. We present HASTE (Hard-negative Attack Sample Training Engine): a systematic framework that iteratively engineers highly evasive prompts, within a modular optimization process, to continuously enhance detection efficacy for prompt-based attack techniques. The framework is agnostic to synthetic data generation methods, and can be generalized to evaluate prompt-injection detection efficacy, with and without fuzzing, for any hard-negative or hard-positive iteration strategy. Experimental evaluation of HASTE shows that hard…

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsAdversarial Robustness in Machine Learning · Network Security and Intrusion Detection · Security and Verification in Computing