Harnessing EHRs for Diffusion-based Anomaly Detection on Chest X-rays
Harim Kim, Yuhan Wang, Minkyu Ahn, Heeyoul Choi, Yuyin Zhou, and Charmgil Hong

TL;DR
This paper introduces Diff3M, a multi-modal diffusion framework that combines chest X-rays and EHR data with cross-attention to improve unsupervised anomaly detection in medical imaging.
Contribution
The paper presents a novel image-EHR cross-attention module, which brings structured clinical context into diffusion-based image generation, and a static masking strategy for reconstructing normal-like images from anomalous inputs.
Findings
Diff3M outperforms existing UAD methods on CheXpert and MIMIC-CXR/IV datasets.
Incorporating EHR data improves differentiation between normal and abnormal images.
The authors report state-of-the-art results for unsupervised anomaly detection on chest X-rays.
Abstract
Unsupervised anomaly detection (UAD) in medical imaging is crucial for identifying pathological abnormalities without requiring extensive labeled data. However, existing diffusion-based UAD models rely solely on imaging features, limiting their ability to distinguish between normal anatomical variations and pathological anomalies. To address this, we propose Diff3M, a multi-modal diffusion-based framework that integrates chest X-rays and structured Electronic Health Records (EHRs) for enhanced anomaly detection. Specifically, we introduce a novel image-EHR cross-attention module to incorporate structured clinical context into the image generation process, improving the model's ability to differentiate normal from abnormal features. Additionally, we develop a static masking strategy to enhance the reconstruction of normal-like images from anomalies. Extensive evaluations on CheXpert and…
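To make the image-EHR cross-attention idea concrete, here is a minimal, dependency-free sketch of single-head cross-attention in which image tokens act as queries and EHR tokens act as keys and values. This is an illustration under stated assumptions, not the Diff3M implementation: the abstract does not say which modality supplies the queries, how EHR fields are embedded, or where the module sits in the denoising network. All names (`cross_attention`, `image_tokens`, `ehr_tokens`, the weight matrices) and all dimensions are hypothetical.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng):
    """Small random matrix standing in for learned weights or features."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def cross_attention(image_tokens, ehr_tokens, w_q, w_k, w_v):
    """Let image tokens (queries) attend over EHR tokens (keys/values).

    Assumed shapes (hypothetical, for illustration only):
      image_tokens: (n_img, d_img), ehr_tokens: (n_ehr, d_ehr)
      w_q: (d_img, d_img), w_k: (d_ehr, d_img), w_v: (d_ehr, d_img)
    Returns the image tokens plus attended EHR context, and the weights.
    """
    q = matmul(image_tokens, w_q)            # (n_img, d_img)
    k = matmul(ehr_tokens, w_k)              # (n_ehr, d_img)
    v = matmul(ehr_tokens, w_v)              # (n_ehr, d_img)
    scale = math.sqrt(len(q[0]))
    k_t = [list(col) for col in zip(*k)]     # transpose to (d_img, n_ehr)
    scores = matmul(q, k_t)                  # (n_img, n_ehr)
    weights = [softmax([s / scale for s in row]) for row in scores]
    context = matmul(weights, v)             # (n_img, d_img)
    # Residual connection: keep the image features, add EHR context.
    fused = [
        [x + c for x, c in zip(img_row, ctx_row)]
        for img_row, ctx_row in zip(image_tokens, context)
    ]
    return fused, weights


rng = random.Random(0)
n_img, n_ehr, d_img, d_ehr = 4, 3, 8, 5      # toy sizes, chosen arbitrarily
image_tokens = random_matrix(n_img, d_img, rng)
ehr_tokens = random_matrix(n_ehr, d_ehr, rng)
w_q = random_matrix(d_img, d_img, rng)
w_k = random_matrix(d_ehr, d_img, rng)
w_v = random_matrix(d_ehr, d_img, rng)

fused, weights = cross_attention(image_tokens, ehr_tokens, w_q, w_k, w_v)
print(len(fused), len(fused[0]), len(weights), len(weights[0]))
```

Each row of `weights` is a distribution over the EHR tokens for one image token, so the fused output keeps the image feature shape while mixing in clinical context. A real model would likely use learned multi-head attention inside the denoising network rather than random single-head weights.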
