Palisade -- Prompt Injection Detection Framework

Sahasra Kokkula; Somanathan R; Nandavardhan R; Aashishkumar; G Divya

arXiv:2410.21146·cs.CL·October 29, 2024

Palisade -- Prompt Injection Detection Framework

Sahasra Kokkula, Somanathan R, Nandavardhan R, Aashishkumar, G Divya

PDF

Open Access

TL;DR

Palisade introduces a layered NLP-based framework combining rule-based filtering, ML classification, and LLM analysis to detect prompt injection attacks, improving security in AI systems.

Contribution

The paper presents a novel multi-layer detection framework that enhances prompt injection detection accuracy over traditional static methods.

Findings

01

ML classifier achieves highest individual accuracy

02

Multi-layer approach reduces false negatives significantly

03

Framework prioritizes security despite increased false positives

Abstract

The advent of Large Language Models LLMs marks a milestone in Artificial Intelligence, altering how machines comprehend and generate human language. However, LLMs are vulnerable to malicious prompt injection attacks, where crafted inputs manipulate the models behavior in unintended ways, compromising system integrity and causing incorrect outcomes. Conventional detection methods rely on static, rule-based approaches, which often fail against sophisticated threats like abnormal token sequences and alias substitutions, leading to limited adaptability and higher rates of false positives and false negatives.This paper proposes a novel NLP based approach for prompt injection detection, emphasizing accuracy and optimization through a layered input screening process. In this framework, prompts are filtered through three distinct layers rule-based, ML classifier, and companion LLM before…

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsEpilepsy research and treatment · Neurological disorders and treatments · Cardiovascular Syncope and Autonomic Disorders