The Learnability Gap in Medical Latent Diffusion
Mischa Dombrowski, Felix Nützel, Bernhard Kainz

TL;DR
This paper investigates the learnability gap in medical latent diffusion models: pretrained autoencoders preserve the features needed for classification, yet their latent spaces are difficult for classifiers to learn from. It proposes methods that partially address the issue.
Contribution
It formalizes the learnability gap in autoencoder latent spaces for medical imaging and introduces noise-conditioned classifiers with distillation to improve learnability.
Findings
The learnability gap persists across multiple autoencoder architectures and medical benchmarks.
Fine-tuning autoencoders on medical data does not close the learnability gap.
The proposed latent classifiers offer substantial gains in throughput and memory efficiency and also serve as diagnostic tools.
Abstract
Generative data augmentation with latent diffusion models is a promising strategy for addressing class imbalance in medical imaging, yet current approaches focus on perceptual fidelity and domain-specific autoencoder fine-tuning while neglecting a more fundamental bottleneck. We identify and formalize the learnability gap: large-scale pretrained autoencoders faithfully encode discriminative features for medical classification, as evidenced by near-lossless performance in reconstruction space, yet their latent representations are structured in ways that are difficult for classifiers to learn from. Across five autoencoder families and four medical benchmarks spanning chest radiography, dermatoscopy, computed tomography, and echocardiography, we show that this gap persists regardless of architecture, initialization strategy, or hyperparameter tuning, and that medical-domain fine-tuning of…
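The distinction the abstract draws, between information that survives encoding and information a classifier can actually learn from, can be illustrated with a small sketch. It is not the paper's formal definition, which is not reproduced on this page. The function name `learnability_gap`, its three arguments, and the accuracy values are hypothetical placeholders: the sketch assumes one classifier has been evaluated on original images, one on autoencoder reconstructions, and one directly on latents.

```python
def learnability_gap(acc_pixel: float, acc_recon: float, acc_latent: float) -> dict:
    """Split an accuracy drop into two parts (illustrative assumption only).

    acc_pixel  : accuracy of a classifier trained on the original images
    acc_recon  : accuracy of a classifier trained on autoencoder reconstructions
    acc_latent : accuracy of a classifier trained directly on the latents
    """
    return {
        # Drop attributable to information lost by the autoencoder.
        "information_loss": acc_pixel - acc_recon,
        # Drop that remains although the information is still present:
        # the latents are simply harder to learn from.
        "learnability_gap": acc_recon - acc_latent,
    }


# Placeholder numbers, not results from the paper.
example = learnability_gap(acc_pixel=0.90, acc_recon=0.89, acc_latent=0.80)
print(example)
```

In this reading, a small `information_loss` combined with a large `learnability_gap` corresponds to the situation the abstract describes: reconstruction-space performance is near-lossless, but latent-space classifiers still fall short.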
