ShapeUP: Scalable Image-Conditioned 3D Editing

Inbar Gat; Dana Cohen-Bar; Guy Levy; Elad Richardson; Daniel Cohen-Or

arXiv:2602.05676 · cs.CV · April 28, 2026

TL;DR

ShapeUP introduces a scalable 3D editing framework that uses a supervised latent translation approach with a 3D Diffusion Transformer, enabling precise, controllable, and consistent 3D asset modifications.

Contribution

The paper presents an image-conditioned 3D editing method that builds on a pretrained 3D foundation model and adapts it to editing through supervised training, improving control and scalability.

Findings

1. Outperforms existing methods in identity preservation and edit fidelity.

2. Enables fine-grained, mask-free local and global 3D edits.

3. Maintains structural consistency with original assets.

Abstract

Recent advancements in 3D foundation models have enabled the generation of high-fidelity assets, yet precise 3D manipulation remains a significant challenge. Existing 3D editing frameworks often face a difficult trade-off between visual controllability, geometric consistency, and scalability. Specifically, optimization-based methods are prohibitively slow, multi-view 2D propagation techniques suffer from visual drift, and training-free latent manipulation methods are inherently bound by frozen priors and cannot directly benefit from scaling. In this work, we present ShapeUP, a scalable, image-conditioned 3D editing framework that formulates editing as a supervised latent-to-latent translation within a native 3D representation. This formulation allows ShapeUP to build on a pretrained 3D foundation model, leveraging its strong generative prior while adapting it to editing through…
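To make "supervised latent-to-latent translation" concrete, here is a minimal toy sketch of one conditional denoising loss term. It is not the authors' implementation: the summary above does not specify the architecture, the loss, or the conditioning scheme. The function names, the dimensions, the linear stand-in for the 3D Diffusion Transformer, and the noise-prediction objective are all assumptions chosen for illustration.

```python
import numpy as np

def make_training_pair(rng, latent_dim=8, image_dim=4):
    """Hypothetical toy triplet: (source latent, edit-image embedding, target latent)."""
    source = rng.standard_normal(latent_dim)
    image_emb = rng.standard_normal(image_dim)
    # Toy "edit" (assumption): the target is the source shifted by an
    # image-dependent offset, so most of the source is preserved.
    target = source + 0.5 * np.resize(image_emb, latent_dim)
    return source, image_emb, target

def denoising_loss(weights, source, image_emb, target, rng):
    """One supervised noise-prediction loss term, conditioned on source and image.

    `weights` is a linear stand-in for a denoising network (assumption).
    """
    t = rng.uniform(0.0, 1.0)                    # random noise level in [0, 1)
    noise = rng.standard_normal(target.shape)
    # Corrupt the *target* latent; the model sees the clean source as context.
    noisy_target = np.sqrt(1.0 - t) * target + np.sqrt(t) * noise
    features = np.concatenate([noisy_target, source, image_emb, [t]])
    predicted_noise = weights @ features
    return float(np.mean((predicted_noise - noise) ** 2))

rng = np.random.default_rng(0)
source, image_emb, target = make_training_pair(rng)
# 8 noisy-target dims + 8 source dims + 4 image dims + 1 noise level = 21 inputs
weights = np.zeros((8, 21))
print(denoising_loss(weights, source, image_emb, target, rng))
```

The point of the sketch is only the data flow: the supervision comes from paired source and target latents, and the edit image enters as a conditioning signal rather than through per-asset optimization.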
