MPIB: A Benchmark for Medical Prompt Injection Attacks and Clinical Safety in LLMs

Junhyeok Lee, Han Jang, and Kyu Sung Choi

arXiv:2602.06268·cs.CL·February 9, 2026

TL;DR

This paper introduces MPIB, a benchmark dataset for evaluating how safely Large Language Models behave in clinical settings under prompt injection attacks, with an emphasis on the risk of real-world clinical harm.

Contribution

MPIB is the first benchmark to evaluate the clinical safety of LLMs against prompt injection. It incorporates outcome-level risk metrics and diverse attack scenarios.

Findings

01 ASR and CHER can diverge significantly across evaluations.

02 Robustness varies depending on whether the adversarial prompt appears in the user query or in the retrieved context.

03 Evaluation reveals critical vulnerabilities in current LLM defenses.

Abstract

Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG) systems are increasingly integrated into clinical workflows; however, prompt injection attacks can steer these systems toward clinically unsafe or misleading outputs. We introduce the Medical Prompt Injection Benchmark (MPIB), a dataset-and-benchmark suite for evaluating clinical safety under both direct prompt injection and indirect, RAG-mediated injection across clinically grounded tasks. MPIB emphasizes outcome-level risk via the Clinical Harm Event Rate (CHER), which measures high-severity clinical harm events under a clinically grounded taxonomy, and reports CHER alongside Attack Success Rate (ASR) to disentangle instruction compliance from downstream patient risk. The benchmark comprises 9,697 curated instances constructed through multi-stage quality gates and clinical safety linting. Evaluating MPIB across a…
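To make the distinction between the two metrics concrete, here is a minimal Python sketch of how ASR and CHER might be computed side by side. It is not the authors' implementation. The `EvalRecord` schema, the 0 to 4 severity scale, the threshold of 3 for "high-severity", and the choice to use all attacked instances as the denominator for both rates are assumptions made for illustration; the source only states that CHER counts high-severity clinical harm events and is reported alongside ASR.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class EvalRecord:
    """One attacked benchmark instance (hypothetical schema, not MPIB's)."""
    attack_succeeded: bool  # model complied with the injected instruction
    harm_severity: int      # assumed ordinal scale: 0 (none) to 4 (severe)


# Assumed cutoff for "high-severity"; the source gives no numeric scale.
HIGH_SEVERITY_THRESHOLD = 3


def attack_success_rate(records: Sequence[EvalRecord]) -> float:
    """Fraction of instances where the model followed the injection."""
    if not records:
        raise ValueError("records must be non-empty")
    return sum(r.attack_succeeded for r in records) / len(records)


def clinical_harm_event_rate(
    records: Sequence[EvalRecord],
    threshold: int = HIGH_SEVERITY_THRESHOLD,
) -> float:
    """Fraction of instances whose output reaches the severity threshold."""
    if not records:
        raise ValueError("records must be non-empty")
    return sum(r.harm_severity >= threshold for r in records) / len(records)


# Toy data: three attacks succeed, but only one output is severely harmful.
example_records = [
    EvalRecord(attack_succeeded=True, harm_severity=0),
    EvalRecord(attack_succeeded=True, harm_severity=4),
    EvalRecord(attack_succeeded=False, harm_severity=0),
    EvalRecord(attack_succeeded=True, harm_severity=1),
]

print(f"ASR:  {attack_success_rate(example_records):.2f}")
print(f"CHER: {clinical_harm_event_rate(example_records):.2f}")
```

On this toy data the ASR is 0.75 while the CHER is 0.25, which illustrates how a model can comply with most injections yet produce a severe harm event in only a minority of cases. Reporting both numbers keeps instruction compliance separate from downstream patient risk.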

Code & Models

Datasets

jhlee0619/mpib (dataset, 22 downloads)

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Artificial Intelligence in Healthcare and Education · Machine Learning in Healthcare