Detection Method for Prompt Injection by Integrating Pre-trained Model and Heuristic Feature Engineering

Yi Ji; Runzhi Li; Baolei Mao

arXiv:2506.06384·cs.CL·June 10, 2025

Detection Method for Prompt Injection by Integrating Pre-trained Model and Heuristic Feature Engineering

Yi Ji, Runzhi Li, Baolei Mao

PDF

Open Access

TL;DR

This paper introduces DMPI-PMHFE, a dual-channel detection framework combining a pretrained language model and heuristic features to effectively identify prompt injection attacks across various LLMs, improving security and robustness.

Contribution

The paper presents a novel dual-channel detection method that integrates semantic and structural features, enhancing prompt injection attack detection across multiple LLMs.

Findings

01

Outperforms existing detection methods in accuracy, recall, and F1-score.

02

Reduces attack success rates significantly across mainstream LLMs.

03

Demonstrates effectiveness on diverse benchmark datasets.

Abstract

With the widespread adoption of Large Language Models (LLMs), prompt injection attacks have emerged as a significant security threat. Existing defense mechanisms often face critical trade-offs between effectiveness and generalizability. This highlights the urgent need for efficient prompt injection detection methods that are applicable across a wide range of LLMs. To address this challenge, we propose DMPI-PMHFE, a dual-channel feature fusion detection framework. It integrates a pretrained language model with heuristic feature engineering to detect prompt injection attacks. Specifically, the framework employs DeBERTa-v3-base as a feature extractor to transform input text into semantic vectors enriched with contextual information. In parallel, we design heuristic rules based on known attack patterns to extract explicit structural features commonly observed in attacks. Features from both…

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsAdversarial Robustness in Machine Learning · Advanced Malware Detection Techniques · Network Security and Intrusion Detection