Vision-Language Models Encode Clinical Guidelines for Concept-Based Medical Reasoning

Mohamed Harmanani; Bining Long; Zhuoxin Guo; Paul F.R. Wilson; Amirhossein Sabour; Minh Nguyen Nhat To; Gabor Fichtinger; Purang Abolmaesumi; Parvin Mousavi

arXiv:2603.08921 · cs.CV · March 11, 2026

Open Access

TL;DR

This paper introduces MedCBR, a concept-based reasoning framework that integrates clinical guidelines with vision-language models to improve interpretability and diagnostic accuracy in medical imaging.

Contribution

MedCBR combines clinical guidelines with vision-language and reasoning models to enhance interpretability and diagnostic performance in medical imaging tasks.

Findings

01. Achieved AUROC of 94.2% on ultrasound diagnosis

02. Achieved AUROC of 84.0% on mammography diagnosis

03. Generalized well to non-medical datasets with 86.1% accuracy

Abstract

Concept Bottleneck Models (CBMs) are a prominent framework for interpretable AI that map learned visual features to a set of meaningful concepts for task-specific downstream predictions. Their sequential structure enhances transparency by connecting model predictions to the underlying concepts that support them. In medical imaging, where transparency is essential, CBMs offer an appealing foundation for explainable model design. However, discrete concept representations often overlook broader clinical context such as diagnostic guidelines and expert heuristics, reducing reliability in complex cases. We propose MedCBR, a concept-based reasoning framework that integrates clinical guidelines with vision-language and reasoning models. Labeled clinical descriptors are transformed into guideline-conformant text, and a concept-based model is trained with a multitask objective combining…
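
To make the sequential structure described in the abstract concrete, here is a minimal sketch of a generic concept bottleneck: features are mapped to concept probabilities, and the label is predicted from those concepts alone. This is not MedCBR itself. The class name, the concept names, the three-dimensional feature vector, and all weights are hypothetical values chosen by hand for illustration; a real model would learn them from data.

```python
import math
from typing import List, Tuple


def sigmoid(x: float) -> float:
    """Logistic function mapping a real-valued score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


class ConceptBottleneck:
    """Generic two-stage model: features -> concepts -> label.

    Hypothetical illustration only; weights are fixed by hand, not learned.
    """

    def __init__(
        self,
        concept_w: List[List[float]],
        concept_b: List[float],
        label_w: List[float],
        label_b: float,
    ) -> None:
        self.concept_w = concept_w
        self.concept_b = concept_b
        self.label_w = label_w
        self.label_b = label_b

    def predict_concepts(self, features: List[float]) -> List[float]:
        """Stage 1: one probability per concept, from visual features."""
        return [
            sigmoid(sum(w * f for w, f in zip(row, features)) + b)
            for row, b in zip(self.concept_w, self.concept_b)
        ]

    def predict_from_concepts(self, concepts: List[float]) -> float:
        """Stage 2: the label depends only on the concept probabilities."""
        logit = sum(w * c for w, c in zip(self.label_w, concepts)) + self.label_b
        return sigmoid(logit)

    def predict(self, features: List[float]) -> Tuple[List[float], float]:
        """Run both stages and return (concepts, label probability)."""
        concepts = self.predict_concepts(features)
        return concepts, self.predict_from_concepts(concepts)


# Hypothetical concept names, used only to label the bottleneck outputs.
CONCEPT_NAMES = ["irregular_margin", "calcification"]

model = ConceptBottleneck(
    concept_w=[[2.0, 0.0, -1.0], [0.0, 3.0, 0.0]],
    concept_b=[0.0, -1.0],
    label_w=[1.5, 2.0],
    label_b=-2.0,
)

features = [1.0, 1.0, 0.0]  # stand-in for learned visual features
concepts, prob = model.predict(features)

# Intervention: set every concept to "absent" and re-run stage 2 only.
prob_without_concepts = model.predict_from_concepts([0.0, 0.0])

for name, value in zip(CONCEPT_NAMES, concepts):
    print(f"{name}: {value:.3f}")
print(f"label probability: {prob:.3f}")
print(f"label probability with concepts zeroed: {prob_without_concepts:.3f}")
```

Because the label is computed only from the concept vector, each prediction can be traced back to the concepts that support it, and a reviewer can override a concept value to see how the output changes. The sketch does not model the guideline-conformant text or the multitask objective that the abstract attributes to MedCBR.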

Taxonomy

Topics: Multimodal Machine Learning Applications · Explainable Artificial Intelligence (XAI) · Generative Adversarial Networks and Image Synthesis