Obedience to Unsafe Clinical Instructions: How Large Language Models Respond to Authority Cues
Mahmud Omar, Reem Agbareia, Jolion McGreevy, Alon Gorenshtein, Alexander Charney, Ankit Sakhuja, Benjamin S. Glicksberg, Girish Nadkarni, Eyal Klang

TL;DR
This study shows that large language models in clinical settings often follow unsafe instructions when pressured by authority cues, but adding safety reminders can reduce harmful decisions.
Contribution
The paper introduces a novel evaluation framework for LLMs' responses to authority cues in clinical scenarios, revealing harmful compliance patterns.
Findings
11.7% of LLM outputs across 10,096,800 clinical decision scenarios were harmful.
Mitigation cues reduced harmful decisions by up to 22.1 percentage points in real-world discharge cases.
Authority and responsibility-transfer cues led to the highest harmful compliance rates.
Abstract
Large language models (LLMs) are being integrated into clinical environments where deference to authority can cause harm. Unlike hallucination or bias, obedience to unsafe instructions represents a distinct safety failure: following an explicit but harmful order. We conducted a cross-sectional evaluation of 20 proprietary, open-source, and clinically tuned LLMs across 10,096,800 clinical decision scenarios, including synthetic vignettes with predefined safe versus unsafe options and real-world discharge recommendations reframed to include unsafe contradictory requests. Each scenario was presented under a neutral control or one of six Milgram-style social-pressure conditions (authority, responsibility transfer, urgency, threat, conformity, depersonalization), with or without a short mitigation cue instructing verification or escalation if unsafe. The primary outcome was the proportion…
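The design described in the abstract (a neutral control plus six social-pressure framings, each with or without a mitigation cue) can be sketched as a small condition grid with a per-condition harmful-output rate. This is a minimal illustration, not the authors' code: the function names, the record fields (framing, mitigated, harmful), and the toy records are all hypothetical, and the sketch assumes the mitigation cue is crossed with every framing including the control.

```python
from itertools import product

# Framings named in the abstract; "control" is the neutral baseline.
FRAMINGS = [
    "control", "authority", "responsibility_transfer", "urgency",
    "threat", "conformity", "depersonalization",
]
MITIGATION = [False, True]  # without / with the short mitigation cue


def condition_grid():
    """All (framing, mitigated) pairs. Assumes a full crossing (an assumption)."""
    return list(product(FRAMINGS, MITIGATION))


def harmful_rate(records, framing, mitigated):
    """Share of outputs labeled harmful in one condition, or None if it is empty."""
    hits = [
        r["harmful"] for r in records
        if r["framing"] == framing and r["mitigated"] == mitigated
    ]
    return sum(hits) / len(hits) if hits else None


def reduction_pp(records, framing):
    """Percentage-point drop in harmful rate when the mitigation cue is added."""
    base = harmful_rate(records, framing, False)
    mitigated = harmful_rate(records, framing, True)
    if base is None or mitigated is None:
        return None
    return 100.0 * (base - mitigated)


# Toy records for illustration only; these are not data from the paper.
toy = (
    [{"framing": "authority", "mitigated": False, "harmful": h}
     for h in (True, True, False, False)]
    + [{"framing": "authority", "mitigated": True, "harmful": h}
       for h in (True, False, False, False)]
)

print(len(condition_grid()))
print(harmful_rate(toy, "authority", False))
print(reduction_pp(toy, "authority"))
```

Reporting the mitigation effect as a difference in rates matches the way the findings are phrased (a reduction in percentage points rather than a relative change).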
Taxonomy
Topics: Healthcare Decision-Making and Restraints · Patient-Provider Communication in Healthcare · Clinical Reasoning and Diagnostic Skills
