IP-Composer: Semantic Composition of Visual Concepts

Sara Dorfman; Dana Cohen-Bar; Rinon Gal; Daniel Cohen-Or

arXiv:2502.13951 · cs.CV · February 20, 2025

TL;DR

IP-Composer is a training-free method that generates new images from multiple reference images, using natural-language descriptions to specify which visual concept is taken from each reference.

Contribution

It extends IP-Adapter to handle multiple visual inputs using composite CLIP embeddings, enabling more accurate and diverse image synthesis without additional training.
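
To make "composite CLIP embeddings" concrete, here is a minimal sketch of one plausible reading, not the authors' implementation. It assumes that each concept corresponds to a low-rank subspace of the CLIP embedding space, estimated with an SVD over text embeddings that describe the concept, and that the composite keeps the base image's embedding everywhere except in that subspace, where the reference image's component is substituted. The function names `concept_projection` and `compose_embedding`, the rank, and the random vectors standing in for real CLIP embeddings are all hypothetical.

```python
import numpy as np


def concept_projection(text_embeddings: np.ndarray, rank: int) -> np.ndarray:
    """Build a (d, d) projector onto the top-`rank` right singular
    directions of a stack of text embeddings with shape (n, d).

    Assumption: these directions approximate a concept subspace.
    """
    _, _, vt = np.linalg.svd(text_embeddings, full_matrices=False)
    basis = vt[:rank]          # (rank, d), rows are orthonormal
    return basis.T @ basis     # orthogonal projection matrix


def compose_embedding(base: np.ndarray, concept_refs) -> np.ndarray:
    """Swap the concept-subspace component of `base` for that of each
    reference. `concept_refs` is a list of (reference_embedding, projector)
    pairs. If subspaces overlap, later pairs override earlier ones.
    """
    composite = base.copy()
    for ref, proj in concept_refs:
        composite = composite - proj @ composite + proj @ ref
    return composite


# Stand-ins for real CLIP embeddings (hypothetical sizes and data).
rng = np.random.default_rng(0)
dim = 16
concept_texts = rng.normal(size=(50, dim))  # embeddings of concept descriptions
base_image = rng.normal(size=dim)           # embedding of the base image
ref_image = rng.normal(size=dim)            # embedding of the concept reference

proj = concept_projection(concept_texts, rank=4)
composite = compose_embedding(base_image, [(ref_image, proj)])
```

In a real pipeline the composite vector would be passed to IP-Adapter in place of a single image embedding. That wiring is omitted here because it depends on the specific IP-Adapter implementation in use.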

Findings

01. Enables compositional image generation with multiple references

02. Provides more precise control over complex visual concepts

03. Operates without training or specialized data

Abstract

Content creators often draw inspiration from multiple visual sources, combining distinct elements to craft new compositions. Modern computational approaches now aim to emulate this fundamental creative process. Although recent diffusion models excel at text-guided compositional synthesis, text as a medium often lacks precise control over visual details. Image-based composition approaches can capture more nuanced features, but existing methods are typically limited in the range of concepts they can capture, and require expensive training procedures or specialized data. We present IP-Composer, a novel training-free approach for compositional image generation that leverages multiple image references simultaneously, while using natural language to describe the concept to be extracted from each image. Our method builds on IP-Adapter, which synthesizes novel images conditioned on an input…

Taxonomy

Topics: Semantic Web and Ontologies · Cognitive Computing and Networks

Methods: Diffusion · Contrastive Language-Image Pre-training