AIComposer: Any Style and Content Image Composition via Feature Integration

Haowen Li; Zhenfeng Fan; Zhang Wen; Zhengzhou Zhu; Yunjin Li

arXiv:2507.20721 · cs.CV · July 29, 2025

TL;DR

AIComposer is a cross-domain image composition method that needs no text prompts. It preserves foreground content and reports better stylization and composition quality than prior methods, using a diffusion model with CLIP feature integration.

Contribution

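The authors present it as the first cross-domain image composition method that works without text prompts. A simple MLP integrates CLIP features from the foreground and background, and local cross-attention steers the diffusion process, improving robustness and style transfer without training the diffusion model itself.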

Findings

01. Outperforms state-of-the-art methods on the LPIPS and CSD metrics.

02. Preserves foreground content effectively during stylization.

03. Demonstrates robustness across diverse styles and contents.

Abstract

Image composition has advanced significantly with large-scale pre-trained T2I diffusion models. Despite progress in same-domain composition, cross-domain composition remains under-explored. The main challenges are the stochastic nature of diffusion models and the style gap between input images, leading to failures and artifacts. Additionally, heavy reliance on text prompts limits practical applications. This paper presents the first cross-domain image composition method that does not require text prompts, allowing natural stylization and seamless compositions. Our method is efficient and robust, preserving the diffusion prior, as it involves minor steps for backward inversion and forward denoising without training the diffuser. Our method also uses a simple multilayer perceptron network to integrate CLIP features from foreground and background, manipulating diffusion with a local…
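
The feature-integration step mentioned in the abstract can be illustrated with a toy sketch. The abstract is truncated and does not specify the architecture, so everything below is an assumption made for illustration: the two feature vectors are concatenated, the MLP has two layers with a ReLU, the weights are random and untrained, and the inputs are random stand-ins for real CLIP embeddings. All names (`make_linear`, `fuse_features`, `FEATURE_DIM`, `HIDDEN_DIM`) are hypothetical and not taken from the paper.

```python
import math
import random

# Toy sizes chosen for illustration only; real CLIP embeddings are far larger.
FEATURE_DIM = 8
HIDDEN_DIM = 16


def make_linear(in_dim, out_dim, rng):
    """Create an untrained linear layer as (weights, bias) with small random weights."""
    scale = 1.0 / math.sqrt(in_dim)
    weights = [[rng.uniform(-scale, scale) for _ in range(in_dim)]
               for _ in range(out_dim)]
    bias = [0.0] * out_dim
    return weights, bias


def linear(x, layer):
    """Apply y = W x + b to a single feature vector."""
    weights, bias = layer
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(x):
    """Element-wise ReLU."""
    return [max(0.0, v) for v in x]


def fuse_features(fg_feat, bg_feat, layers):
    """Assumed fusion: concatenate foreground and background features,
    then pass them through a two-layer MLP."""
    hidden = relu(linear(fg_feat + bg_feat, layers[0]))
    return linear(hidden, layers[1])


rng = random.Random(0)
layers = [
    make_linear(2 * FEATURE_DIM, HIDDEN_DIM, rng),
    make_linear(HIDDEN_DIM, FEATURE_DIM, rng),
]

# Random placeholders standing in for CLIP features of the two input images.
fg_feat = [rng.gauss(0.0, 1.0) for _ in range(FEATURE_DIM)]
bg_feat = [rng.gauss(0.0, 1.0) for _ in range(FEATURE_DIM)]

fused = fuse_features(fg_feat, bg_feat, layers)
print(len(fused))
```

In the method as the abstract describes it, the integrated features are used to manipulate the diffusion process; that conditioning step and the local cross-attention are not sketched here because the available text does not describe them in enough detail.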
