CareCom: Generative Image Composition with Calibrated Reference Features

Jiaxuan Chen; Bo Zhang; Qingdong He; Jinlong Peng; Li Niu

arXiv:2511.11060 · cs.CV · November 17, 2025

TL;DR

This paper introduces CareCom, a generative image composition model that accepts multiple foreground reference images and calibrates their features against the background, improving both detail preservation and pose/view adjustment when inserting a foreground object into a background.

Contribution

The paper extends an existing generative composition model to accept an arbitrary number of foreground reference images, and proposes calibrating the global and local reference features so that they are compatible with the background.

Findings

01 Calibrated reference features improve both detail preservation and foreground pose/view adjustment.

02 Multiple foreground reference images can be used together for a single composition.

03 The approach is validated on the MVImgNet and MureCom datasets.

Abstract

Image composition aims to seamlessly insert a foreground object into a background. Despite substantial progress in generative image composition, existing methods still struggle to preserve detail while also adjusting the foreground pose/view. To address this issue, we extend the existing generative composition model to a multi-reference version, which allows using an arbitrary number of foreground reference images. Furthermore, we propose to calibrate the global and local features of the foreground reference images to make them compatible with the background information. The calibrated reference features can supplement the original reference features with useful global and local information of the proper pose/view. Extensive experiments on MVImgNet and MureCom demonstrate that the generative model can greatly benefit from the calibrated reference features.
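The abstract does not spell out how the calibration is computed, so the following is only a toy sketch of the general idea under stated assumptions. It assumes each reference image has already been encoded into a list of token vectors, and it stands in a plain residual cross-attention over background tokens for the calibration step. The function names (`stack_references`, `calibrate`, `build_conditioning`) and the attention formulation are hypothetical illustrations, not the paper's method.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def stack_references(ref_feats):
    """Flatten per-image token lists into one token list.

    Works for any number of reference images, each with any number of tokens.
    """
    return [tok for feats in ref_feats for tok in feats]

def calibrate(ref_tokens, bg_tokens):
    """ASSUMPTION: calibration modeled as residual cross-attention.

    Each reference token attends to the background tokens, and the attended
    background context is added back onto the reference token.
    """
    d = len(bg_tokens[0])
    out = []
    for q in ref_tokens:
        weights = softmax([dot(q, k) / math.sqrt(d) for k in bg_tokens])
        ctx = [sum(w * k[j] for w, k in zip(weights, bg_tokens)) for j in range(d)]
        out.append([qj + cj for qj, cj in zip(q, ctx)])
    return out

def build_conditioning(ref_feats, bg_tokens):
    """Keep the original reference tokens and append the calibrated ones."""
    ref_tokens = stack_references(ref_feats)
    return ref_tokens + calibrate(ref_tokens, bg_tokens)

# Two reference images (2 tokens and 1 token) and a 2-token background, all 2-D.
refs = [[[1.0, 0.0], [0.0, 1.0]], [[0.5, 0.5]]]
bg = [[1.0, 1.0], [0.0, 2.0]]
cond = build_conditioning(refs, bg)
print(len(cond), len(cond[0]))  # prints "6 2"
```

In this sketch the calibrated tokens are appended to the original ones rather than replacing them, mirroring the abstract's statement that calibrated features supplement the original reference features. How the real model injects these features into the generator is not described in the abstract and is left out here.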

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Face recognition and analysis · Visual Attention and Saliency Detection