Palette: Image-to-Image Diffusion Models
Chitwan Saharia, William Chan, Huiwen Chang, Chris A. Lee, Jonathan Ho, Tim Salimans, David J. Fleet, Mohammad Norouzi

TL;DR
This paper introduces a unified diffusion-based framework for image-to-image translation tasks, demonstrating superior performance over GANs across multiple challenging applications without task-specific tuning.
Contribution
The paper presents a simple, unified diffusion model that outperforms specialized methods on various image translation tasks and advocates for standardized evaluation protocols.
Findings
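Diffusion models outperform strong GAN and regression baselines on all four tasks (colorization, inpainting, uncropping, and JPEG restoration).
In the denoising objective, an L2 loss yields more diverse samples than an L1 loss.
Empirical studies show that self-attention in the neural architecture is important to the model's performance.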
Abstract
This paper develops a unified framework for image-to-image translation based on conditional diffusion models and evaluates this framework on four challenging image-to-image translation tasks, namely colorization, inpainting, uncropping, and JPEG restoration. Our simple implementation of image-to-image diffusion models outperforms strong GAN and regression baselines on all tasks, without task-specific hyper-parameter tuning, architecture customization, or any auxiliary loss or sophisticated new techniques needed. We uncover the impact of an L2 vs. L1 loss in the denoising diffusion objective on sample diversity, and demonstrate the importance of self-attention in the neural architecture through empirical studies. Importantly, we advocate a unified evaluation protocol based on ImageNet, with human evaluation and sample quality scores (FID, Inception Score, Classification Accuracy of a…
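The L2 vs. L1 comparison mentioned in the abstract concerns the norm applied to the noise-prediction error during training. The following is a minimal NumPy sketch of a conditional denoising loss with a switchable norm, not the authors' implementation. The names `noisy_target`, `denoising_loss`, and `predict_noise`, the scalar noise level `gamma`, and the use of a mean over pixels are all assumptions made for illustration.

```python
import numpy as np

def noisy_target(y0, eps, gamma):
    """Mix the clean target y0 with Gaussian noise eps.

    gamma in (0, 1] is an assumed scalar noise level: gamma = 1 keeps y0
    unchanged, while values near 0 give almost pure noise.
    """
    return np.sqrt(gamma) * y0 + np.sqrt(1.0 - gamma) * eps

def denoising_loss(predict_noise, x, y0, gamma, rng, p=2):
    """Score a noise predictor with an L1 (p=1) or L2 (p=2) penalty.

    predict_noise is a hypothetical callable (x, y_noisy, gamma) -> array
    shaped like y0, standing in for the conditional denoising network.
    x is the conditioning image (e.g. grayscale input for colorization).
    """
    if p not in (1, 2):
        raise ValueError("p must be 1 (L1) or 2 (L2)")
    eps = rng.standard_normal(y0.shape)       # noise the network should recover
    y_noisy = noisy_target(y0, eps, gamma)    # corrupted version of the target
    residual = predict_noise(x, y_noisy, gamma) - eps
    return float(np.mean(np.abs(residual) ** p))
```

Switching `p` between 1 and 2 changes only the penalty on the residual. Everything else in the training step stays the same, which is what makes the two losses directly comparable.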
Taxonomy
Topics: Natural Language Processing Techniques · Cancer-related molecular mechanisms research · Multimodal Machine Learning Applications
Methods: Diffusion
