SPML: A DSL for Defending Language Models Against Prompt Attacks
Reshabh K Sharma, Vinayak Gupta, Dan Grossman

TL;DR
This paper introduces SPML, a domain-specific language designed to defend language model chatbots from prompt-based attacks by refining prompts, monitoring inputs, and providing a benchmark for evaluating chatbot safety and robustness.
Contribution
SPML offers a novel language for refining and monitoring chatbot prompts, along with a benchmark dataset, improving both the security of chatbots and the ease of writing chatbot definitions.
Findings
SPML detects attacker prompts more effectively than GPT-4, GPT-3.5, and LLAMA.
The benchmark dataset enables comprehensive evaluation of chatbot defenses.
SPML streamlines chatbot creation with programming language features.
Abstract
Large language models (LLMs) have profoundly transformed natural language applications, with a growing reliance on instruction-based definitions for designing chatbots. However, once deployed, chatbot definitions are fixed and vulnerable to attacks by malicious users, which underscores the need to prevent unethical applications and financial losses. Existing studies explore the impact of user prompts on LLM-based chatbots, yet practical methods for containing attacks on application-specific chatbots remain unexplored. This paper presents System Prompt Meta Language (SPML), a domain-specific language for refining prompts and monitoring inputs to LLM-based chatbots. SPML actively checks for attack prompts, ensuring that user inputs align with the chatbot definition so that malicious inputs are not executed on the LLM backbone, which also reduces cost. It also streamlines chatbot definition crafting with programming…
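The monitoring idea in the abstract, checking a user input against the chatbot definition before it reaches the LLM backbone, can be illustrated with a minimal sketch. The abstract shows no SPML syntax, so this is plain Python and not SPML. The names `ChatbotDefinition`, `violates_definition`, and `guarded_reply`, along with the regular-expression patterns, are hypothetical and chosen only for illustration. Simple pattern matching stands in for whatever checking SPML actually performs.

```python
import re
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ChatbotDefinition:
    """Hypothetical stand-in for a chatbot definition (not SPML syntax)."""
    name: str
    # Illustrative patterns that suggest an attempt to override the definition.
    override_patterns: list[str] = field(default_factory=lambda: [
        r"ignore (all |any )?(previous|prior|above) instructions",
        r"you are no longer",
        r"reveal (your|the) system prompt",
    ])


def violates_definition(definition: ChatbotDefinition, user_input: str) -> bool:
    """Return True if the input matches any override pattern."""
    text = user_input.lower()
    return any(re.search(p, text) for p in definition.override_patterns)


def guarded_reply(
    definition: ChatbotDefinition,
    user_input: str,
    call_llm: Callable[[str], str],
) -> str:
    """Check the input first; call the (costly) LLM only if it passes."""
    if violates_definition(definition, user_input):
        return "Request blocked: it conflicts with the chatbot definition."
    return call_llm(user_input)
```

Because the check runs before `call_llm`, a rejected input never incurs a model call, which is the cost argument the abstract makes. A real monitor would need far more than keyword patterns to catch paraphrased attacks.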
Taxonomy
Topics: Security and Verification in Computing
Methods: Label Smoothing · Linear Layer · Absolute Position Encodings · Byte Pair Encoding · Position-Wise Feed-Forward Layer · Transformer · Dense Connections · Cosine Annealing
