Detecting Unforeseen Data Properties with Diffusion Autoencoder Embeddings using Spine MRI data

Robert Graf; Florian Hunecke; Soeren Pohl; Matan Atad; Hendrik Moeller; Sophie Starck; Thomas Kroencke; Stefanie Bette; Fabian Bamberg; Tobias Pischon; Thoralf Niendorf; Carsten Schmidt; Johannes C. Paetzold; Daniel Rueckert; Jan S Kirschke

arXiv:2410.10220·cs.CV·October 15, 2024

Open Access

TL;DR

This paper demonstrates that Diffusion Autoencoder embeddings can effectively uncover biases, data abnormalities, and protocol variations in large-scale spine MRI datasets, improving data quality assessment for medical imaging.

Contribution

It introduces the use of DAE embeddings for bias detection and data quality analysis in medical imaging, outperforming existing generative models in identifying hidden data properties.

Findings

01. DAE embeddings separate protected variables like sex and age.

02. t-SNE visualization reveals protocol variations such as head positioning.

03. Embeddings identify samples challenging for sex prediction models.
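As a rough illustration of the first and third findings, the sketch below checks whether a binary attribute is separable in an embedding space and flags the samples closest to the decision boundary. It is not the paper's method: the vectors are synthetic Gaussian stand-ins rather than real DAE embeddings, the nearest-centroid probe is a simple substitute for whatever classifier the authors used, and all function names and parameters (`make_synthetic_embeddings`, `nearest_centroid_probe`, `shift`, `dim`) are hypothetical.

```python
import math
import random


def make_synthetic_embeddings(n_per_class=200, dim=16, shift=1.5, seed=0):
    """Draw two Gaussian clouds as stand-ins for embeddings of two groups.

    Assumption: real DAE embeddings would be loaded here instead.
    """
    rng = random.Random(seed)
    data = []
    for label in (0, 1):
        for _ in range(n_per_class):
            vec = [rng.gauss(shift * label, 1.0) for _ in range(dim)]
            data.append((vec, label))
    rng.shuffle(data)
    return data


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def distance(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_centroid_probe(train, test):
    """Predict each test label from the closer class centroid.

    Returns (prediction, true_label, margin) tuples; a small margin means
    the sample lies near the boundary between the two groups.
    """
    c0 = centroid([v for v, y in train if y == 0])
    c1 = centroid([v for v, y in train if y == 1])
    results = []
    for vec, label in test:
        d0, d1 = distance(vec, c0), distance(vec, c1)
        pred = 0 if d0 < d1 else 1
        results.append((pred, label, abs(d0 - d1)))
    return results


data = make_synthetic_embeddings()
split = int(0.8 * len(data))  # 80/20 train/held-out split
results = nearest_centroid_probe(data[:split], data[split:])

# Fraction of held-out samples whose attribute is recovered from the embedding.
accuracy = sum(p == y for p, y, _ in results) / len(results)

# The five held-out samples with the smallest margin: candidates for
# "challenging" cases that merit a closer look.
hardest = sorted(results, key=lambda r: r[2])[:5]

print(f"held-out accuracy: {accuracy:.2f}")
print("smallest margins:", [round(m, 3) for _, _, m in hardest])
```

High held-out accuracy in such a probe would suggest the attribute is encoded in the embedding, while the low-margin samples are the ones a downstream prediction model would most plausibly struggle with.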

Abstract

Deep learning has made significant strides in medical imaging, leveraging large datasets to improve diagnostics and prognostics. However, large datasets often come with inherent errors introduced through subject selection and acquisition. In this paper, we investigate the use of Diffusion Autoencoder (DAE) embeddings for uncovering and understanding data characteristics and biases, including biases for protected variables like sex and data abnormalities indicative of unwanted protocol variations. We use sagittal T2-weighted magnetic resonance (MR) images of the neck, chest, and lumbar region from 11,186 German National Cohort (NAKO) participants. We compare DAE embeddings with existing generative models like StyleGAN and Variational Autoencoder. Evaluations on a large-scale dataset consisting of sagittal T2-weighted MR images of three spine regions show that DAE embeddings effectively…

Taxonomy

Topics: Medical Imaging and Analysis · Radiomics and Machine Learning in Medical Imaging · Medical Imaging Techniques and Applications

Methods: Dense Connections · Adaptive Instance Normalization · Convolution · R1 Regularization · Diffusion · Feedforward Network · StyleGAN