General Image-to-Image Translation with One-Shot Image Guidance

Bin Cheng; Zuhao Liu; Yunbo Peng; Yue Lin

arXiv:2307.14352 · cs.CV · September 21, 2023 · 1 citation

TL;DR

This paper introduces a novel framework called visual concept translator (VCT) that enables content-preserving, one-shot image-to-image translation guided by a single reference image, outperforming existing methods in flexibility and quality.

Contribution

The paper presents VCT, a new framework with content-concept inversion and fusion processes for effective one-shot image translation guided by a single reference image.

Findings

01. VCT effectively preserves source image content while translating visual concepts.

02. VCT achieves superior results across various image translation tasks.

03. Extensive experiments demonstrate the method's effectiveness and superiority.

Abstract

Large-scale text-to-image models pre-trained on massive text-image pairs have recently shown excellent performance in image synthesis. However, an image can convey visual concepts more intuitively than plain text. This raises a question: how can we integrate a desired visual concept into an existing image, such as a portrait? Current methods fall short of this demand because they lack the ability to preserve content or to translate visual concepts effectively. Motivated by this, we propose a novel framework named visual concept translator (VCT), which preserves the content of the source image and translates visual concepts guided by a single reference image. VCT contains a content-concept inversion (CCI) process to extract contents and concepts, and a content-concept fusion (CCF) process that combines the extracted information to produce the target image. Given only one…

Code & Models

Repositories

crystalneuro/visual-concept-translator
PyTorch · Official

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Multimodal Machine Learning Applications · Image Processing Techniques and Applications