Cog2Gen3D: Sculpturing 3D Semantic-Geometric Cognition for 3D Generation
Haonan Wang, Hanyu Zhou, Haoyue Liu, Tao Gu, Luxin Yan

TL;DR
Cog2Gen3D introduces a novel 3D cognition-guided diffusion framework that leverages semantic and geometric information to produce physically plausible and structurally rational 3D models, outperforming existing methods.
Contribution
The paper proposes a new framework combining semantic and geometric cognition for 3D generation, addressing scale inconsistency and enhancing physical plausibility.
Findings
Outperforms existing methods in semantic fidelity.
Achieves higher geometric plausibility.
Ensures structural rationality in 3D generation.
Abstract
Generative models have succeeded in producing semantically plausible 2D images, but 3D generation remains challenging due to the absence of spatial geometry constraints. Typically, existing methods use geometric features as conditions to enhance spatial awareness. However, these methods can only model relative relationships and are prone to scale inconsistency in absolute geometry. We therefore argue that semantic information and absolute geometry together empower 3D cognition, thereby enabling controllable 3D generation for the physical world. In this work, we propose Cog2Gen3D, a 3D cognition-guided diffusion framework for 3D generation. Our model is guided by three key designs: 1) Cognitive Feature Embeddings. We encode different modalities into semantic and geometric representations and further extract logical representations. 2) 3D Latent Cognition Graph. We structure different…
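The abstract's point about relative relationships and scale inconsistency can be illustrated with a small sketch. This is not the paper's method: the `normalize_relative` helper and the depth values are hypothetical, chosen only to show that a purely relative (min-max normalized) geometric cue is identical for two scenes that differ in absolute size.

```python
import math


def normalize_relative(depths):
    """Min-max normalize depths to [0, 1]; this discards absolute scale."""
    lo, hi = min(depths), max(depths)
    return [(d - lo) / (hi - lo) for d in depths]


# Hypothetical depths (in meters) for the same layout at two physical sizes.
small_scene = [1.0, 2.0, 4.0]
large_scene = [10.0, 20.0, 40.0]  # same layout, ten times larger

rel_small = normalize_relative(small_scene)
rel_large = normalize_relative(large_scene)

# The relative cues match even though the absolute geometry differs.
same_relative = all(
    math.isclose(a, b) for a, b in zip(rel_small, rel_large)
)
print(same_relative)
```

A condition built only from such relative values cannot tell the two scenes apart, which is one way to read the scale-inconsistency problem the abstract describes.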
Taxonomy
Topics: 3D Shape Modeling and Analysis · Human Motion and Animation · Generative Adversarial Networks and Image Synthesis
