Simple diffusion: End-to-end diffusion for high resolution images
Emiel Hoogeboom, Jonathan Heek, Tim Salimans

TL;DR
This paper introduces a simplified approach to train standard diffusion models directly on high-resolution images, achieving state-of-the-art results without complex multi-stage or latent diffusion methods.
Contribution
The paper demonstrates that a few targeted adjustments (tuning the noise schedule, scaling only part of the architecture, placing dropout selectively, and downsampling) make high-resolution diffusion modeling both effective and straightforward.
Findings
Adjusted noise schedule improves high-res performance
Selective architecture scaling suffices for quality
Downsampling helps manage high-res feature maps
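To make the downsampling finding concrete, here is a minimal sketch of why it helps: halving the side length of a feature map leaves a quarter as many spatial positions for the expensive layers to process. The 2x2 average pooling operator and the function name `avg_pool_2x2` are assumptions chosen for illustration; the summary above does not say which downsampling operator the paper uses.

```python
def avg_pool_2x2(feature_map):
    """Downsample a 2D feature map (list of equal-length rows) by 2x2 averaging.

    Illustrative only: average pooling is an assumed operator, not
    necessarily the one used in the paper.
    """
    height = len(feature_map)
    width = len(feature_map[0])
    if height % 2 or width % 2:
        raise ValueError("height and width must both be even")
    pooled = []
    for i in range(0, height, 2):
        row = []
        for j in range(0, width, 2):
            # Average the four values in the 2x2 block starting at (i, j).
            block_sum = (
                feature_map[i][j]
                + feature_map[i][j + 1]
                + feature_map[i + 1][j]
                + feature_map[i + 1][j + 1]
            )
            row.append(block_sum / 4.0)
        pooled.append(row)
    return pooled


example = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
pooled = avg_pool_2x2(example)  # 4x4 input becomes a 2x2 output
```

Applied once, a 4x4 map becomes 2x2; applied at the input of a network, the same idea would let later layers operate on far fewer positions than the full-resolution image has pixels.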
Abstract
Currently, applying diffusion models in pixel space of high resolution images is difficult. Instead, existing approaches focus on diffusion in lower dimensional spaces (latent diffusion), or have multiple super-resolution levels of generation referred to as cascades. The downside is that these approaches add additional complexity to the diffusion framework. This paper aims to improve denoising diffusion for high resolution images while keeping the model as simple as possible. The paper is centered around the research question: How can one train a standard denoising diffusion model on high resolution images, and still obtain performance comparable to these alternate approaches? The four main findings are: 1) the noise schedule should be adjusted for high resolution images, 2) it is sufficient to scale only a particular part of the architecture, 3) dropout should be added at specific…
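One way to picture finding 1) is a resolution-dependent shift of a cosine noise schedule. The sketch below is an illustration under stated assumptions, not a confirmed reproduction of the paper's method: it assumes a cosine schedule with log-SNR equal to -2 log tan(πt/2), treats 64x64 as the reference resolution, and shifts log-SNR by 2 log(64/d) for images of side length d. The function names and the `base_resolution` parameter are hypothetical.

```python
import math


def logsnr_cosine(t):
    """Log signal-to-noise ratio of a cosine schedule, for t in the open interval (0, 1)."""
    return -2.0 * math.log(math.tan(math.pi * t / 2.0))


def logsnr_shifted(t, resolution, base_resolution=64):
    """Cosine log-SNR shifted for a given image side length.

    Assumption: larger images receive more noise at the same t, via an
    additive shift of 2 * log(base_resolution / resolution).
    """
    return logsnr_cosine(t) + 2.0 * math.log(base_resolution / resolution)


def alpha_sigma(logsnr):
    """Convert log-SNR into variance-preserving signal and noise scales."""
    alpha_sq = 1.0 / (1.0 + math.exp(-logsnr))  # sigmoid(logsnr)
    return math.sqrt(alpha_sq), math.sqrt(1.0 - alpha_sq)


# Compare the schedule midpoint at the reference and at a higher resolution.
mid_64 = logsnr_shifted(0.5, resolution=64)
mid_256 = logsnr_shifted(0.5, resolution=256)
alpha_256, sigma_256 = alpha_sigma(mid_256)
```

Under these assumptions the midpoint log-SNR drops as the resolution grows, so a 256x256 image is noised more heavily at t = 0.5 than a 64x64 image would be.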
Taxonomy
Topics: Advanced Neuroimaging Techniques and Applications · Mathematical Biology Tumor Growth · Cell Image Analysis Techniques
Methods: Diffusion · Dropout
