Geometry-Editable and Appearance-Preserving Object Composition

Jianman Lin; Haojie Li; Chunmei Qing; Zhijing Yang; Liang Lin; Tianshui Chen

arXiv:2505.20914 · cs.CV · May 19, 2026

TL;DR

The paper introduces DGAD, a diffusion-based model that enables geometry-editable and appearance-preserving object composition by combining semantic embeddings with a cross-attention retrieval mechanism.

Contribution

DGAD is the first model to explicitly disentangle geometry editing and appearance preservation in object composition using diffusion models and cross-attention.

Findings

01

DGAD achieves superior geometric editing and appearance preservation on benchmarks.

02

The cross-attention retrieval mechanism effectively aligns fine-grained appearance features.

03

Experiments demonstrate DGAD's ability to produce realistic, geometry-adjusted composite images.

Abstract

General object composition (GOC) aims to seamlessly integrate a target object into a background scene with desired geometric properties, while simultaneously preserving its fine-grained appearance details. Recent approaches derive semantic embeddings and integrate them into advanced diffusion models to enable geometry-editable generation. However, these highly compact embeddings encode only high-level semantic cues and inevitably discard fine-grained appearance details. We introduce a Disentangled Geometry-editable and Appearance-preserving Diffusion (DGAD) model that first leverages semantic embeddings to implicitly capture the desired geometric transformations and then employs a cross-attention retrieval mechanism to align fine-grained appearance features with the geometry-edited representation, facilitating both precise geometry editing and faithful appearance preservation in object…
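The cross-attention retrieval idea in the abstract can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation: the function name `cross_attention_retrieve`, the toy vectors, and the absence of learned projection layers are all assumptions made for illustration. The sketch only shows the general pattern in which geometry-edited features act as queries and look up fine-grained appearance features (values) through their associated keys.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention_retrieve(queries, keys, values):
    """Scaled dot-product cross-attention (hypothetical helper, no learned weights).

    queries: geometry-edited features, one vector per output location
    keys:    features describing each reference-object location
    values:  fine-grained appearance features at those same locations
    Returns one retrieved appearance vector per query.
    """
    scale = 1.0 / math.sqrt(len(keys[0]))
    value_dim = len(values[0])
    retrieved = []
    for q in queries:
        # Similarity between this query and every key.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        weights = softmax(scores)
        # Weighted mix of appearance vectors.
        mixed = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(value_dim)]
        retrieved.append(mixed)
    return retrieved


# Toy data (assumed): two reference locations with distinct appearance vectors.
keys = [[10.0, 0.0], [0.0, 10.0]]
values = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
# The "geometry-edited" queries visit the same locations in swapped order.
queries = [[0.0, 10.0], [10.0, 0.0]]

retrieved = cross_attention_retrieve(queries, keys, values)
for row in retrieved:
    print([round(x, 3) for x in row])
```

In this toy setup each query matches one key far more strongly than the other, so it retrieves almost exactly the appearance vector stored at that location, and the output follows the new (swapped) layout. A real model would use learned query, key, and value projections inside the diffusion network.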

Code & Models

Repositories

jianmanlincjx/DGAD (GitHub)

Taxonomy

Topics: 3D Shape Modeling and Analysis · Image Processing and 3D Reconstruction · Robotics and Sensor-Based Localization