TF-ICON: Diffusion-Based Training-Free Cross-Domain Image Composition

Shilin Lu; Yanzhu Liu; Adams Wai-Kin Kong

arXiv:2307.12493 · cs.CV · October 11, 2023 · 2 cites

TL;DR

TF-ICON is a training-free framework that uses text-driven diffusion models for cross-domain image composition, enabling seamless object integration without additional training or fine-tuning.

Contribution

The paper introduces a training-free approach that pairs off-the-shelf text-driven diffusion models with an "exceptional prompt", a prompt that contains no information, for real image inversion, improving cross-domain image composition.

Findings

01 Outperforms state-of-the-art inversion methods on multiple datasets

02 Surpasses prior baselines in diverse visual domains

03 Operates without additional training or fine-tuning

Abstract

Text-driven diffusion models have exhibited impressive generative capabilities, enabling various image editing tasks. In this paper, we propose TF-ICON, a novel Training-Free Image COmpositioN framework that harnesses the power of text-driven diffusion models for cross-domain image-guided composition. This task aims to seamlessly integrate user-provided objects into a specific visual context. Current diffusion-based methods often involve costly instance-based optimization or finetuning of pretrained models on customized datasets, which can potentially undermine their rich prior. In contrast, TF-ICON can leverage off-the-shelf diffusion models to perform cross-domain image-guided composition without requiring additional training, finetuning, or optimization. Moreover, we introduce the exceptional prompt, which contains no information, to facilitate text-driven diffusion models in…

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning

Methods: Diffusion