Critique of Impure Reason: Unveiling the reasoning behaviour of medical Large Language Models
Shamus Sim, Tyrone Chen

TL;DR
This paper critically examines the reasoning behavior of medical Large Language Models, emphasizing the need for explainability to enhance trust and integration in healthcare, and proposes frameworks for understanding their reasoning processes.
Contribution
It introduces a conceptual framework for analyzing reasoning in medical LLMs and surveys current approaches, highlighting open challenges for developing transparent models.
Findings
Survey of state-of-the-art reasoning approaches in medical LLMs
Proposed theoretical frameworks for interpretability
Identification of key open challenges in model development
Abstract
Background: Despite the current ubiquity of Large Language Models (LLMs) across the medical domain, there is a surprising lack of studies which address their reasoning behaviour. We emphasise the importance of understanding reasoning behaviour as opposed to high-level prediction accuracies, since it is equivalent to explainable AI (XAI) in this context. In particular, achieving XAI in medical LLMs used in the clinical domain will have a significant impact across the healthcare sector. Results: Therefore, in this work, we adapt the existing concept of reasoning behaviour and articulate its interpretation within the specific context of medical LLMs. We survey and categorise current state-of-the-art approaches for modeling and evaluating reasoning in medical LLMs. Additionally, we propose theoretical frameworks which can empower medical professionals or machine learning engineers…
Taxonomy
Topics: Topic Modeling
