RoentGen: Vision-Language Foundation Model for Chest X-ray Generation
Pierre Chambon, Christian Bluethgen, Jean-Benoit Delbrouck, Rogier van der Sluijs, Małgorzata Połacin, Juan Manuel Zambrano Chaves, Tanishq Mathew Abraham, Shivanshu Purohit, Curtis P. Langlotz, Akshay Chaudhari

TL;DR
RoentGen is a vision-language model adapted for chest X-ray generation, capable of producing diverse, high-quality synthetic images conditioned on radiology reports, aiding data augmentation and disease understanding.
Contribution
This work adapts a pre-trained latent diffusion model to medical image synthesis, bridging the gap between models trained on natural image-text pairs and the domain-specific requirements of medical imaging.
Findings
Generated images are visually convincing and diverse.
Synthetic data improves classifier performance by up to 5%.
Fine-tuning enhances disease representation in the text encoder.
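The data augmentation finding above can be illustrated with a minimal sketch of how a training set might be assembled from real and synthetic samples before training a classifier. This is not the authors' pipeline: the function name `mix_real_and_synthetic`, the `(image_path, report_text)` record layout, and the `synthetic_ratio` parameter are all hypothetical, chosen only to show the idea of training jointly on both sources.

```python
import random
from typing import List, Tuple

# Hypothetical record layout: (image_path, report_text).
Sample = Tuple[str, str]


def mix_real_and_synthetic(
    real: List[Sample],
    synthetic: List[Sample],
    synthetic_ratio: float = 0.5,
    seed: int = 0,
) -> List[Sample]:
    """Return all real samples plus a random subset of synthetic ones, shuffled.

    synthetic_ratio is the number of synthetic samples to add, expressed as a
    fraction of the real set size. It is an illustrative knob, not a value
    taken from the paper.
    """
    if synthetic_ratio < 0:
        raise ValueError("synthetic_ratio must be non-negative")
    rng = random.Random(seed)  # seeded so the mix is reproducible
    # Never request more synthetic samples than are available.
    n_extra = min(len(synthetic), round(len(real) * synthetic_ratio))
    mixed = list(real) + rng.sample(synthetic, n_extra)
    rng.shuffle(mixed)
    return mixed
```

Setting `synthetic_ratio=0` reproduces a real-only baseline, which gives a simple way to compare a classifier trained with and without synthetic images under otherwise identical settings.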
Abstract
Multimodal models trained on large natural image-text pair datasets have exhibited astounding abilities in generating high-quality images. Medical imaging data is fundamentally different to natural images, and the language used to succinctly capture relevant details in medical data uses a different, narrow but semantically rich, domain-specific vocabulary. Not surprisingly, multi-modal models trained on natural image-text pairs do not tend to generalize well to the medical domain. Developing generative imaging models faithfully representing medical concepts while providing compositional diversity could mitigate the existing paucity of high-quality, annotated medical imaging datasets. In this work, we develop a strategy to overcome the large natural-medical distributional shift by adapting a pre-trained latent diffusion model on a corpus of publicly available chest x-rays (CXR) and their…
Taxonomy
Topics: Colorectal Cancer Screening and Detection · Radiomics and Machine Learning in Medical Imaging · AI in cancer detection
Methods: Latent Diffusion Model · Diffusion
