"Do I Trust the AI?" Towards Trustworthy AI-Assisted Diagnosis: Understanding User Perception in LLM-Supported Reasoning
Yuansong Xu, Yichao Zhu, Haokai Wang, Yuchen Wu, Yang Ouyang, Hanlu Li, Wenzhe Zhou, Xinyu Liu, Chang Jiang, and Quan Li

TL;DR
This paper explores physicians' perceptions of LLMs in clinical reasoning, revealing gaps between perceived and benchmarked capabilities, and discusses how to improve trust and collaboration in AI-assisted diagnosis.
Contribution
The paper provides empirical insights into physicians' perceptions of LLMs' clinical reasoning, highlights gaps in current evaluation, and proposes ways to enhance trustworthy human-AI collaboration.
Findings
Physicians value certain aspects of clinical reasoning in LLMs.
Perceived LLM capabilities often differ from benchmark performance.
The study identifies opportunities to improve trust in AI-assisted diagnosis.
Abstract
Large language models (LLMs) have shown considerable potential in supporting medical diagnosis. However, their effective integration into clinical workflows is hindered by physicians' difficulties in perceiving and trusting LLM capabilities, which often results in miscalibrated trust. Existing model evaluations primarily emphasize standardized benchmarks and predefined tasks, offering limited insights into clinical reasoning practices. Moreover, research on human-AI collaboration has rarely examined physicians' perceptions of LLMs' clinical reasoning capability. In this work, we investigate how physicians perceive LLMs' capabilities in the clinical reasoning process. We designed clinical cases, collected the corresponding analyses, and obtained evaluations from physicians (N=37) to quantitatively represent their perceived LLM diagnostic capabilities. By comparing the perceived…
Taxonomy
Topics: Artificial Intelligence in Healthcare and Education · Explainable Artificial Intelligence (XAI) · Clinical Reasoning and Diagnostic Skills
