CompoSE: Compositional Synthesis and Editing of 3D Shapes via Part-Aware Control

Habib Slim; Shariq Farooq Bhat; Mohamed Elhoseiny; Yifan Wang; Mike Roberts

arXiv:2605.19350 · cs.GR · May 20, 2026

TL;DR

CompoSE is a novel diffusion transformer-based method that enables part-aware, localized editing and synthesis of 3D shapes from coarse geometric primitives, learning part semantics without text prompts.

Contribution

Introduces CompoSE, a diffusion transformer-based approach that synthesizes and edits 3D shapes under part-aware control, without requiring part-level text prompts.

Findings

01

Outperforms existing methods on guided 3D shape synthesis

02

Enables localized editing operations such as substitution, addition, deletion, and resizing

03

Learns part semantics and symmetries directly from coarse layouts
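
To make the editing operations listed above concrete, here is a minimal sketch of how a coarse part layout could be represented and edited before it is handed to a generative model. Everything in it is an assumption for illustration: the `PartBox` type, the axis-aligned center/size parameterization, the `part_N` ids, and the helper names are hypothetical and are not taken from the paper. Substitution (resynthesizing one part's geometry) is left out because it depends on the model itself.

```python
from dataclasses import dataclass, replace
from typing import Dict, Set, Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class PartBox:
    """Axis-aligned box for one part (hypothetical representation)."""
    center: Vec3
    size: Vec3


# Part id -> box. Ids are opaque labels, not semantic names.
Layout = Dict[str, PartBox]


def add_part(layout: Layout, part_id: str, box: PartBox) -> Layout:
    """Return a new layout with one extra part."""
    if part_id in layout:
        raise KeyError(f"part {part_id!r} already exists")
    return {**layout, part_id: box}


def delete_part(layout: Layout, part_id: str) -> Layout:
    """Return a new layout without the given part."""
    return {k: v for k, v in layout.items() if k != part_id}


def resize_part(layout: Layout, part_id: str, scale: Vec3) -> Layout:
    """Return a new layout with one box scaled per axis about its center."""
    box = layout[part_id]
    new_size = tuple(s * k for s, k in zip(box.size, scale))
    return {**layout, part_id: replace(box, size=new_size)}


def changed_parts(before: Layout, after: Layout) -> Set[str]:
    """Ids whose box was added, removed, or modified by an edit."""
    return {k for k in before.keys() | after.keys()
            if before.get(k) != after.get(k)}


# Usage: shrink one part along the y axis and see which parts an edit touches.
base = {
    "part_0": PartBox(center=(0.0, 0.0, 0.0), size=(1.0, 1.0, 1.0)),
    "part_1": PartBox(center=(0.0, 1.5, 0.0), size=(1.0, 2.0, 1.0)),
}
edited = resize_part(base, "part_1", (1.0, 0.5, 1.0))
print(changed_parts(base, edited))
```

Each helper returns a new dictionary instead of mutating its input, so the layout before and after an edit can be compared directly. In this sketch, that comparison identifies the parts a localized edit would need to regenerate.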

Abstract

Creating and editing high-quality 3D content remains a central challenge in computer graphics. We address this challenge by introducing CompoSE, a novel method for Compositional Synthesis and Editing of 3D shapes via part-aware control. Our method takes as input a set of coarse geometric primitives (e.g., bounding boxes) that represent distinct object parts arranged in a particular spatial configuration, and synthesizes as output part-separated 3D objects that support localized granular (i.e., compositional) editing of individual parts. The key insight that enables our method is our use of a diffusion transformer architecture that alternates between processing each part locally and aggregating contextual information across parts globally, and features a novel conditioning technique that ensures strong adherence to the user's input. Importantly, our method learns to infer part semantics…
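
The alternation between local, per-part processing and global aggregation across parts can be illustrated with a toy sketch. This is not the authors' architecture. It assumes single-head attention with identity query/key/value projections, and it omits learned weights, diffusion timestep handling, and the conditioning technique. The function names are hypothetical. It only shows which tokens can exchange information in each kind of step.

```python
import math
from typing import List

Vector = List[float]
Part = List[Vector]  # the tokens belonging to one part


def self_attention(tokens: List[Vector]) -> List[Vector]:
    """Softmax self-attention with identity projections plus a residual."""
    dim = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim)
                  for k in tokens]
        peak = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        mixed = [sum(w * v[d] for w, v in zip(weights, tokens)) / total
                 for d in range(dim)]
        out.append([x + m for x, m in zip(q, mixed)])
    return out


def local_step(parts: List[Part]) -> List[Part]:
    """Attend within each part only; parts do not see each other."""
    return [self_attention(p) for p in parts]


def global_step(parts: List[Part]) -> List[Part]:
    """Attend over the tokens of all parts, then split back per part."""
    flat = [tok for p in parts for tok in p]
    mixed = self_attention(flat)
    out, start = [], 0
    for p in parts:
        out.append(mixed[start:start + len(p)])
        start += len(p)
    return out


def alternating_blocks(parts: List[Part], num_blocks: int = 2) -> List[Part]:
    """Apply a local step followed by a global step, num_blocks times."""
    for _ in range(num_blocks):
        parts = global_step(local_step(parts))
    return parts


# Two layouts that differ only in the second part's token.
layout_a = [[[1.0, 0.0], [0.0, 1.0]], [[0.5, 0.5]]]
layout_b = [[[1.0, 0.0], [0.0, 1.0]], [[2.0, -1.0]]]
print(local_step(layout_a)[0] == local_step(layout_b)[0])
print(global_step(layout_a)[0] == global_step(layout_b)[0])
```

In this toy setup, a local step leaves the first part's tokens unaffected by a change to the second part. A global step lets that change propagate to the first part. This is the sense in which contextual information is aggregated across parts.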
