From Part to Whole: 3D Generative World Model with an Adaptive Structural Hierarchy

Bi'an Du; Daizong Liu; Pufan Li; Wei Hu

arXiv:2603.21557 · cs.CV · March 24, 2026

Open Access

TL;DR

This paper introduces a 3D generative model that learns an adaptive part-whole hierarchy from single images. By dynamically discovering and consolidating latent parts, it generalizes better across semantic categories and across objects of varying structural complexity.

Contribution

The paper proposes an adaptive slot-gating mechanism and a class-agnostic prototype bank for flexible, compositional 3D generation from a single image, removing the need for the user-specified part count that fixed-part models require.
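As a rough illustration of the two ideas named above, the sketch below gates a fixed pool of candidate slots and matches each surviving slot to its nearest entry in a shared prototype bank. It is a toy under stated assumptions, not the paper's implementation: the sigmoid gate, the 0.5 threshold, the cosine-similarity lookup, and every name (`gate_slots`, `nearest_prototype`, `bank`) are hypothetical choices made here for illustration only.

```python
import math


def sigmoid(x):
    """Logistic function mapping a gate logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gate_slots(gate_logits, threshold=0.5):
    """Return indices of slots whose gate probability exceeds the threshold.

    Assumption: a slot is 'active' (represents a part) when its gate is open;
    the number of active slots then varies per object instead of being fixed.
    """
    return [i for i, g in enumerate(gate_logits) if sigmoid(g) > threshold]


def cosine(a, b):
    """Cosine similarity between two non-zero vectors given as lists."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_prototype(slot, bank):
    """Index of the bank entry most similar (by cosine) to the slot vector."""
    return max(range(len(bank)), key=lambda k: cosine(slot, bank[k]))


# Toy data: four candidate slots (2-D features) and their gate logits.
slots = [[1.0, 0.1], [0.0, 1.0], [0.9, 0.2], [0.1, 0.1]]
gate_logits = [2.0, 1.5, -3.0, -1.0]

# Hypothetical class-agnostic bank with two shared part prototypes.
bank = [[1.0, 0.0], [0.0, 1.0]]

active = gate_slots(gate_logits)
assignments = {i: nearest_prototype(slots[i], bank) for i in active}

print("active slots:", active)
print("slot -> prototype:", assignments)
```

In this toy, the part count is simply the number of open gates, so it can differ from object to object, and slots from different categories can map to the same bank entry, which is the sense in which a shared bank allows shape reuse.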

Findings

01 Enhanced cross-category transfer performance

02 Improved part-count extrapolation capabilities

03 Effective shape sharing via prototype bank

Abstract

Single-image 3D generation lies at the core of vision-to-graphics models in the real world. However, it remains a fundamental challenge to achieve reliable generalization across diverse semantic categories and highly variable structural complexity under sparse supervision. Existing approaches typically model objects in a monolithic manner or rely on a fixed number of parts, including recent part-aware models such as PartCrafter, which still require a labor-intensive user-specified part count. Such designs easily lead to overfitting, fragmented or missing structural components, and limited compositional generalization when encountering novel object layouts. To this end, this paper rethinks single-image 3D generation as learning an adaptive part-whole hierarchy in the flexible 3D latent space. We present a novel part-to-whole 3D generative world model that autonomously discovers latent…

Taxonomy

Topics: 3D Shape Modeling and Analysis · Generative Adversarial Networks and Image Synthesis · Face recognition and analysis