Clinician input steers frontier AI models toward both accurate and harmful decisions

Ivan Lopez; Selin S. Everett; Bryan J. Bunning; April S. Liang; Dong Han Yao; Shivam C. Vedak; Kameron C. Black; Sophie Ostmeier; Stephen P. Ma; Emily Alsentzer; Jonathan H. Chen; Akshay S. Chaudhari; Eric Horvitz

arXiv:2603.14158 · cs.HC · March 17, 2026

Open Access

TL;DR

This study evaluates how clinician input influences large language models in clinical settings. Expert input improves diagnostic accuracy, but adversarial input exposes the models to manipulation, which underscores the need for safety measures.

Contribution

The paper introduces a comprehensive framework for assessing clinician-AI interactions, including new metrics and mitigation strategies to enhance safety and robustness of LLMs in healthcare.

Findings

01. Clinician input significantly improves model diagnostic accuracy.

02. Adversarial contexts can degrade model performance and induce harmful echoing.

03. Scaling inference and explicit uncertainty signals mitigate some risks.

Abstract

Large language models (LLMs) are entering clinician workflows, yet evaluations rarely measure how clinician reasoning shapes model behavior during clinical interactions. We combined 61 New England Journal of Medicine Case Records with 92 real-world clinician-AI interactions to evaluate 21 reasoning LLM variants across 8 frontier models on differential diagnosis generation and next-step recommendations under three conditions: reasoning alone, after expert clinician context, and after adversarial clinician context. LLM-clinician concordance increased substantially after clinician exposure, with simulations sharing >=3 differential diagnosis items rising from 65.8% to 93.5% and >=3 next-step recommendations from 20.3% to 53.8%. Expert context significantly improved correct final diagnosis inclusion across all 21 models (mean +20.4 percentage points), reflecting both reasoning improvement…
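The concordance figures in the abstract (the share of simulations in which the model and the clinician have at least 3 items in common) can be illustrated with a minimal sketch. The listing does not say how items were matched, so the case-insensitive exact string matching below is an assumption, and the function names `shared_items` and `concordance_rate` and the toy data are hypothetical, not taken from the paper.

```python
from typing import Iterable, Sequence


def _normalize(item: str) -> str:
    """Lowercase and collapse whitespace (assumed matching rule, not the paper's)."""
    return " ".join(item.lower().split())


def shared_items(model_items: Iterable[str], clinician_items: Iterable[str]) -> int:
    """Count distinct items that appear in both lists after normalization."""
    model_set = {_normalize(x) for x in model_items}
    clinician_set = {_normalize(x) for x in clinician_items}
    return len(model_set & clinician_set)


def concordance_rate(
    pairs: Sequence[tuple[list[str], list[str]]], threshold: int = 3
) -> float:
    """Fraction of simulations whose two lists share at least `threshold` items."""
    if not pairs:
        raise ValueError("need at least one simulation")
    hits = sum(shared_items(m, c) >= threshold for m, c in pairs)
    return hits / len(pairs)


# Hypothetical toy data: (model differential, clinician differential) per simulation.
toy_pairs = [
    (
        ["Sepsis", "Pneumonia", "Pulmonary embolism", "Heart failure"],
        ["pneumonia", "sepsis", "pulmonary  embolism"],
    ),
    (["Migraine", "Stroke"], ["stroke", "Meningitis", "Migraine"]),
]

rate = concordance_rate(toy_pairs)
```

Here the first toy simulation shares three items and the second only two, so one of the two simulations meets the threshold. In a real evaluation, matching diagnoses would likely require clinical adjudication or synonym handling rather than string equality.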

Taxonomy

Topics: Artificial Intelligence in Healthcare and Education · Explainable Artificial Intelligence (XAI) · Machine Learning in Healthcare