Dr-LLaVA: Visual Instruction Tuning with Symbolic Clinical Grounding
Shenghuan Sun, Alexander Schubert, Gregory M. Goldgof, Zhiqing Sun, Thomas Hartvigsen, Atul J. Butte, Ahmed Alaa

TL;DR
This paper introduces Dr-LLaVA, a visual language model for medical diagnosis that uses symbolic clinical reasoning to improve accuracy and consistency in multi-turn medical conversations, reducing reliance on human-labeled data.
Contribution
The paper presents a novel alignment algorithm leveraging symbolic clinical reasoning for training VLMs, enabling scalable, cost-effective medical dialogue systems.
Findings
Dr-LLaVA performs well in multi-turn medical conversations.
The alignment algorithm reduces the need for human-labeled data.
The model shows strong diagnostic reasoning capabilities.
Abstract
Vision-Language Models (VLM) can support clinicians by analyzing medical images and engaging in natural language interactions to assist in diagnostic and treatment tasks. However, VLMs often exhibit "hallucinogenic" behavior, generating textual outputs not grounded in contextual multimodal information. This challenge is particularly pronounced in the medical domain, where we do not only require VLM outputs to be accurate in single interactions but also to be consistent with clinical reasoning and diagnostic pathways throughout multi-turn conversations. For this purpose, we propose a new alignment algorithm that uses symbolic representations of clinical reasoning to ground VLMs in medical knowledge. These representations are utilized to (i) generate GPT-4-guided visual instruction tuning data at scale, simulating clinician-VLM conversations with demonstrations of clinical reasoning, and…
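
To make the idea of consistency with a diagnostic pathway more concrete, here is a minimal Python sketch of how a symbolic representation of clinical reasoning could be used to check a multi-turn conversation. It is an illustration under stated assumptions, not the paper's implementation: the question names, answer labels, and the functions `is_consistent` and `first_inconsistent_turn` are all hypothetical, and the toy pathway carries no clinical meaning.

```python
from typing import List, Optional, Set, Tuple

# Hypothetical toy pathway (an assumption for illustration only).
# Each tuple is one valid sequence of answers to three successive questions:
# image quality, visual finding, and final diagnosis.
VALID_PATHS: Set[Tuple[str, ...]] = {
    ("adequate", "abnormal", "disease_suspected"),
    ("adequate", "normal", "no_disease"),
    ("inadequate", "not_assessable", "inconclusive"),
}


def is_consistent(answers: List[str]) -> bool:
    """Return True if the answers given so far form a prefix of a valid path."""
    n = len(answers)
    return any(path[:n] == tuple(answers) for path in VALID_PATHS)


def first_inconsistent_turn(answers: List[str]) -> Optional[int]:
    """Return the 0-based index of the first turn that leaves every valid
    path, or None if the whole conversation stays on a valid path."""
    for turn in range(1, len(answers) + 1):
        if not is_consistent(answers[:turn]):
            return turn - 1
    return None


if __name__ == "__main__":
    # Each answer is plausible on its own, but the last one contradicts
    # the earlier "normal" finding, so turn index 2 is flagged.
    print(first_inconsistent_turn(["adequate", "normal", "disease_suspected"]))
```

A check of this kind looks at the conversation as a whole, so it can flag an answer that is plausible in isolation but contradicts an earlier turn.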
Taxonomy
Topics: Clinical Reasoning and Diagnostic Skills
