Pretraining is All You Need for Image-to-Image Translation

Tengfei Wang; Ting Zhang; Bo Zhang; Hao Ouyang; Dong Chen; Qifeng Chen; Fang Wen

arXiv:2205.12952 · cs.CV · May 26, 2022 · 90 cites

Open Access · 2 Repos · 7 Models

TL;DR

This paper introduces a pretraining-based framework for image-to-image translation that leverages pretrained diffusion models, significantly improving realism and fidelity across diverse tasks without extensive task-specific architecture design.

Contribution

It presents a simple, generic approach that adapts a pretrained diffusion model to various image-to-image translation tasks, adding adversarial training for better texture synthesis and normalized guidance sampling for higher generation quality.

Findings

01. Outperforms existing methods on benchmarks such as ADE20K, COCO-Stuff, and DIODE.

02. Produces images with unprecedented realism and faithfulness.

03. Requires less task-specific architecture design.

Abstract

We propose to use pretraining to boost general image-to-image translation. Prior image-to-image translation methods usually need dedicated architectural design and train individual translation models from scratch, struggling for high-quality generation of complex scenes, especially when paired training data are not abundant. In this paper, we regard each image-to-image translation problem as a downstream task and introduce a simple and generic framework that adapts a pretrained diffusion model to accommodate various kinds of image-to-image translation. We also propose adversarial training to enhance the texture synthesis in the diffusion model training, in conjunction with normalized guidance sampling to improve the generation quality. We present extensive empirical comparison across various tasks on challenging benchmarks such as ADE20K, COCO-Stuff, and DIODE, showing the proposed…
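The abstract names normalized guidance sampling but does not spell out the procedure. As a rough illustration only, the sketch below assumes it means classifier-free guidance followed by rescaling the guided noise estimate back to the mean and standard deviation of the conditional estimate. The function name `normalized_guidance`, its parameters, and the toy values are hypothetical and are not taken from the paper or its code.

```python
from statistics import fmean, pstdev


def normalized_guidance(eps_cond, eps_uncond, scale):
    """Classifier-free guidance followed by a mean/std rescaling.

    eps_cond / eps_uncond are flat lists of floats standing in for the
    noise predicted with and without the conditioning input
    (hypothetical names, chosen for this illustration).
    """
    # Standard classifier-free guidance: extrapolate away from the
    # unconditional prediction by the guidance scale.
    guided = [u + scale * (c - u) for c, u in zip(eps_cond, eps_uncond)]

    # Assumed normalization step: restore the mean and standard deviation
    # of the conditional prediction, which large scales would shift.
    mu_ref, sigma_ref = fmean(eps_cond), pstdev(eps_cond)
    mu_g, sigma_g = fmean(guided), pstdev(guided)
    if sigma_g == 0.0:
        # Degenerate case: a constant guided estimate cannot be rescaled,
        # so fall back to the reference mean.
        return [mu_ref for _ in guided]
    return [(g - mu_g) / sigma_g * sigma_ref + mu_ref for g in guided]


# Toy values standing in for one denoising step's predictions.
eps_c = [0.2, -0.5, 1.1, 0.4]
eps_u = [0.1, -0.2, 0.6, 0.3]
out = normalized_guidance(eps_c, eps_u, scale=4.0)
```

Under this assumption the guidance scale changes the direction of the noise estimate while its overall statistics stay fixed, which is one plausible reading of why normalization would help at large scales.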

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis

Methods: Diffusion