Adapting Pretrained Vision-Language Foundational Models to Medical Imaging Domains

Pierre Chambon; Christian Bluethgen; Curtis P. Langlotz; Akshay Chaudhari

arXiv:2210.04133 · cs.CV · January 3, 2023 · 44 cites

Open Access

TL;DR

This paper adapts a large pretrained vision-language model, Stable Diffusion, to generate and manipulate medical images, using fine-tuning to address the domain shift from natural images and evaluating the clinical relevance of the results.

Contribution

It demonstrates how to fine-tune the Stable Diffusion model for medical imaging, enabling realistic abnormality insertion while preserving diagnostic features.

Findings

01 Improved image quality metrics over baseline models

02 Radiologist evaluations confirm clinical relevance

03 Model maintains 95% abnormality detection accuracy

Abstract

Multi-modal foundation models are typically trained on millions of pairs of natural images and text captions, frequently obtained through web-crawling approaches. Although such models exhibit excellent generative capabilities, they do not typically generalize well to specific domains such as medical images, whose distributions are fundamentally shifted compared to natural images. Building generative models for medical images that faithfully depict clinical context may help alleviate the paucity of healthcare datasets. Thus, in this study, we seek to research and expand the representational capabilities of large pretrained foundation models to medical concepts, specifically by leveraging the Stable Diffusion model to generate domain-specific images found in medical imaging. We explore the sub-components of the Stable Diffusion pipeline (the variational autoencoder, the U-Net and the…

Taxonomy

Topics: Radiomics and Machine Learning in Medical Imaging · AI in cancer detection · Colorectal Cancer Screening and Detection

Methods: Concatenated Skip Connection · Max Pooling · Convolution · U-Net · Diffusion