Pretraining is All You Need for Image-to-Image Translation
Tengfei Wang, Ting Zhang, Bo Zhang, Hao Ouyang, Dong Chen, Qifeng Chen, Fang Wen

TL;DR
This paper introduces a pretraining-based framework for image-to-image translation that leverages pretrained diffusion models, significantly improving realism and fidelity across diverse tasks without extensive task-specific architecture design.
Contribution
It presents a simple, generic approach that adapts a pretrained diffusion model to a range of image-to-image translation tasks, using adversarial training to improve texture synthesis and normalized guidance sampling to improve generation quality.
Findings
Outperforms existing methods on benchmarks like ADE20K, COCO-Stuff, and DIODE.
Produces images with unprecedented realism and faithfulness.
Requires less task-specific architecture design.
Abstract
We propose to use pretraining to boost general image-to-image translation. Prior image-to-image translation methods usually need dedicated architectural design and train individual translation models from scratch, struggling for high-quality generation of complex scenes, especially when paired training data are not abundant. In this paper, we regard each image-to-image translation problem as a downstream task and introduce a simple and generic framework that adapts a pretrained diffusion model to accommodate various kinds of image-to-image translation. We also propose adversarial training to enhance the texture synthesis in the diffusion model training, in conjunction with normalized guidance sampling to improve the generation quality. We present extensive empirical comparison across various tasks on challenging benchmarks such as ADE20K, COCO-Stuff, and DIODE, showing the proposed…
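The abstract names normalized guidance sampling without defining it. As a rough illustration only, the sketch below assumes it means classifier-free guidance followed by rescaling the guided noise estimate back to the mean and standard deviation of the conditional estimate. The function names, the toy noise vectors, and the choice of reference statistics are assumptions made for this example and are not taken from the text above.

```python
from statistics import fmean, pstdev


def guided_noise(eps_cond, eps_uncond, scale):
    """Classifier-free guidance: move from the unconditional estimate
    toward the conditional one by a factor of `scale`."""
    return [u + scale * (c - u) for c, u in zip(eps_cond, eps_uncond)]


def normalized_guided_noise(eps_cond, eps_uncond, scale, tiny=1e-8):
    """Apply guidance, then rescale the result so its mean and standard
    deviation match those of the conditional estimate (assumed form)."""
    guided = guided_noise(eps_cond, eps_uncond, scale)
    mu_ref, sigma_ref = fmean(eps_cond), pstdev(eps_cond)
    mu_g, sigma_g = fmean(guided), pstdev(guided)
    ratio = sigma_ref / (sigma_g + tiny)  # tiny guards against division by zero
    return [(g - mu_g) * ratio + mu_ref for g in guided]


# Toy noise estimates standing in for the outputs of a diffusion model.
eps_cond = [0.5, -1.0, 2.0, 0.1]
eps_uncond = [0.2, -0.4, 1.0, 0.3]
out = normalized_guided_noise(eps_cond, eps_uncond, scale=4.0)
```

In this sketch a large guidance scale inflates the spread of the guided estimate, and the rescaling step pulls its statistics back to those of the conditional prediction while keeping the guided direction.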
Code & Models
- 🤗 rinna/japanese-stable-diffusion (model) · 15 dl · ♡ 178
- 🤗 svjack/Stable-Diffusion-Pokemon-ja (model) · 9 dl · ♡ 4
- 🤗 svjack/Stable-Diffusion-Pokemon-en (model) · 10 dl · ♡ 4
- 🤗 svjack/Stable-Diffusion-Pokemon-zh (model) · ♡ 5
- 🤗 svjack/Stable-Diffusion-FineTuned-zh-v0 (model) · 4 dl · ♡ 2
- 🤗 svjack/Stable-Diffusion-FineTuned-zh-v1 (model) · 8 dl · ♡ 5
- 🤗 svjack/Stable-Diffusion-FineTuned-zh-v2 (model) · 10 dl · ♡ 6
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Cancer-related molecular mechanisms research
Methods: Diffusion
