Deep Generative Models Unveil Patterns in Medical Images Through Vision-Language Conditioning

Xiaodan Xing; Junzhi Ning; Yang Nan; Guang Yang

arXiv:2410.13823·cs.CV·October 18, 2024

TL;DR

This paper introduces a novel approach using vision-language conditioned deep generative models to reveal patterns in medical images, leveraging clinical data transformed into textual descriptions to improve interpretability and visualization.

Contribution

The study presents a new method combining clinical data and segmentation masks in generative models, utilizing text-visual embeddings to enhance pattern discovery in medical images.

Findings

1. Demonstrated consistent lung intensity shifts related to smoking status.
2. Effectively visualized the impact of clinical attributes on medical images.
3. The approach generalizes to both GAN and diffusion models.

Abstract

Deep generative models have significantly advanced medical imaging analysis by enhancing dataset size and quality. Beyond mere data augmentation, our research in this paper highlights an additional, significant capacity of deep generative models: their ability to reveal and demonstrate patterns in medical images. We employ a generative structure with hybrid conditions, combining clinical data and segmentation masks to guide the image synthesis process. Furthermore, we innovatively transformed the tabular clinical data into textual descriptions. This approach simplifies the handling of missing values and also enables us to leverage large pre-trained vision-language models that investigate the relations between independent clinical entries and comprehend general terms, such as gender and smoking status. Our approach differs from and presents a more challenging task than traditional…
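
The text-conversion step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `record_to_text`, the "field: value" template, and the example record are hypothetical, chosen only to show how writing tabular entries as text lets missing values be dropped without imputation. The resulting string is the kind of input a pre-trained vision-language text encoder could then embed.

```python
import math


def record_to_text(record):
    """Turn one clinical record (a dict) into a short textual description.

    Hypothetical template: each available entry becomes "field: value".
    Entries that are None or NaN are left out of the text entirely,
    so no imputation step is needed.
    """
    parts = []
    for field, value in record.items():
        if value is None:
            continue  # missing entry: skip it
        if isinstance(value, float) and math.isnan(value):
            continue  # NaN also counts as missing
        parts.append(f"{field}: {value}")
    if not parts:
        return "no clinical information available"
    return "; ".join(parts)


# Illustrative record only; the values are made up for the example.
example = {
    "gender": "female",
    "age": 63,
    "smoking status": None,  # missing value
}
print(record_to_text(example))  # gender: female; age: 63
```

A record with every entry missing falls back to a fixed placeholder sentence, so the conditioning text is never empty.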

Code & Models

Repositories

junzhin/dgm-vlc (PyTorch, official)

Taxonomy

Topics: AI in cancer detection · Multimodal Machine Learning Applications · Radiomics and Machine Learning in Medical Imaging

Methods: Diffusion