From Part to Whole: 3D Generative World Model with an Adaptive Structural Hierarchy
Bi'an Du, Daizong Liu, Pufan Li, Wei Hu

TL;DR
This paper introduces a novel 3D generative model that learns an adaptive part-whole hierarchy from images, enabling better generalization across categories and structural complexities by discovering and consolidating latent parts dynamically.
Contribution
It proposes an adaptive slot-gating mechanism and a class-agnostic prototype bank for flexible, compositional 3D generation from single images, improving over fixed-part models.
Findings
Enhanced cross-category transfer performance
Improved part-count extrapolation capabilities
Effective shape sharing via prototype bank
Abstract
Single-image 3D generation lies at the core of vision-to-graphics models in the real world. However, it remains a fundamental challenge to achieve reliable generalization across diverse semantic categories and highly variable structural complexity under sparse supervision. Existing approaches typically model objects in a monolithic manner or rely on a fixed number of parts, including recent part-aware models such as PartCrafter, which still require a labor-intensive user-specified part count. Such designs easily lead to overfitting, fragmented or missing structural components, and limited compositional generalization when encountering novel object layouts. To this end, this paper rethinks single-image 3D generation as learning an adaptive part-whole hierarchy in the flexible 3D latent space. We present a novel part-to-whole 3D generative world model that autonomously discovers latent…
Taxonomy
Topics: 3D Shape Modeling and Analysis · Generative Adversarial Networks and Image Synthesis · Face recognition and analysis
