DiNO-Diffusion. Scaling Medical Diffusion via Self-Supervised Pre-Training
Guillermo Jimenez-Perez, Pedro Osorio, Josef Cersovsky, Javier Montalt-Tordera, Jens Hooge, Steffen Vogler, Sadegh Mohammadi

TL;DR
DiNO-Diffusion is a self-supervised latent diffusion model trained on unlabelled chest X-ray images, enabling high-quality, diverse synthetic data generation and improved downstream task performance without annotated datasets.
Contribution
It introduces a novel self-supervised training method for diffusion models in medical imaging, removing the need for annotations and enabling scalable, versatile data generation.
Findings
Achieves FID scores as low as 4.7, indicating high-quality image generation.
Enhances classification AUC by up to 20% using synthetic data augmentation.
Demonstrates zero-shot lung lobe segmentation with up to 84.4% Dice score.
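The two metrics quoted above (FID and Dice) can be made concrete with a small sketch. This is an illustration under stated assumptions, not the paper's evaluation code: `dice_score` and `frechet_distance` are hypothetical helper names. The Fréchet distance here operates on whatever feature vectors it is given, whereas a reported FID is normally computed on features from a pretrained network, which is not included.

```python
import numpy as np


def dice_score(pred, target, eps=1e-8):
    """Dice coefficient between two binary masks: 2|A and B| / (|A| + |B|)."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    # eps avoids division by zero when both masks are empty
    return float(2.0 * intersection / (pred.sum() + target.sum() + eps))


def frechet_distance(feats_a, feats_b):
    """Frechet distance between Gaussians fitted to two feature sets (rows = samples)."""
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    # Tr((cov_a cov_b)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_a @ cov_b, which are real and non-negative up to
    # numerical error for covariance matrices.
    eigvals = np.linalg.eigvals(cov_a @ cov_b).real
    trace_sqrt = np.sqrt(np.clip(eigvals, 0.0, None)).sum()
    mean_term = np.sum((mu_a - mu_b) ** 2)
    return float(mean_term + np.trace(cov_a) + np.trace(cov_b) - 2.0 * trace_sqrt)
```

A lower Fréchet distance means the generated feature distribution sits closer to the real one, while a Dice score closer to 1 means better mask overlap.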
Abstract
Diffusion models (DMs) have emerged as powerful foundation models for a variety of tasks, with a strong focus on synthetic image generation. However, their need for large annotated datasets for training limits their applicability in medical imaging, where datasets are typically smaller and sparsely annotated. We introduce DiNO-Diffusion, a self-supervised method for training latent diffusion models (LDMs) that conditions the generation process on image embeddings extracted from DiNO. By eliminating the reliance on annotations, our training leverages over 868k unlabelled images from public chest X-Ray (CXR) datasets. Despite being self-supervised, DiNO-Diffusion shows comprehensive manifold coverage, with FID scores as low as 4.7, and emerging properties when evaluated in downstream tasks. It can be used to generate semantically-diverse synthetic datasets even from small data pools,…
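To illustrate the data flow the abstract describes, here is a toy sketch of one training-loss computation in which the conditioning signal is an embedding of the image itself rather than a label. Everything in it is an assumption made for illustration: the sizes are arbitrary, random linear maps stand in for the frozen autoencoder and the frozen DiNO encoder, the denoiser is a single linear layer, and the DDPM-style noise-prediction objective is a common choice that the excerpt above does not confirm.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
IMG_DIM, LATENT_DIM, EMBED_DIM = 64, 16, 8

# Frozen stand-ins: random linear maps in place of the autoencoder
# and the DiNO image encoder (assumptions, not the real networks).
W_vae = rng.normal(size=(IMG_DIM, LATENT_DIM)) / np.sqrt(IMG_DIM)
W_dino = rng.normal(size=(IMG_DIM, EMBED_DIM)) / np.sqrt(IMG_DIM)

# Trainable stand-in denoiser: one linear layer over
# [noisy latent, conditioning embedding, noise-level feature].
W_denoiser = np.zeros((LATENT_DIM + EMBED_DIM + 1, LATENT_DIM))


def training_loss(images, alpha_bar):
    """Noise-prediction loss where the condition is derived from the image itself."""
    latents = images @ W_vae        # encode images into the latent space
    cond = images @ W_dino          # self-supervised condition: no labels involved
    noise = rng.normal(size=latents.shape)
    # Forward diffusion at a noise level given by alpha_bar in (0, 1)
    noisy = np.sqrt(alpha_bar) * latents + np.sqrt(1.0 - alpha_bar) * noise
    level = np.full((len(images), 1), alpha_bar)
    inputs = np.concatenate([noisy, cond, level], axis=1)
    pred_noise = inputs @ W_denoiser
    return float(np.mean((pred_noise - noise) ** 2))


# Usage: a batch of flattened toy "images"
batch = rng.normal(size=(32, IMG_DIM))
loss = training_loss(batch, alpha_bar=0.5)
```

The point of the sketch is only that the condition `cond` is computed from `images`, so no annotation enters the loss at any step.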
Taxonomy
Topics: Infrared Thermography in Medicine
Methods: Attention Is All You Need · Softmax · Residual Connection · Layer Normalization · Focus · Linear Layer · Diffusion · Multi-Head Attention · Dense Connections · Vision Transformer
