A General Framework to Boost 3D GS Initialization for Text-to-3D Generation by Lexical Richness
Lutao Jiang, Hangyu Li, Lin Wang

TL;DR
This paper introduces a novel framework that enhances 3D Gaussian Splatting initialization for text-to-3D generation, especially for complex, lexically rich texts, by integrating spatial and semantic interactions.
Contribution
The proposed framework improves 3D shape initialization by aggregating Gaussians into voxels and adding new modules for spatial and semantic interaction, outperforming existing initialization methods such as Shap-E.
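To make the idea of aggregating Gaussians into voxels concrete, here is a minimal sketch, not the authors' implementation. It assumes each Gaussian is reduced to a center position plus a feature vector, bins the centers into a regular grid, and mean-pools the features of each occupied voxel. The function name `aggregate_to_voxels`, the grid size, the scene bounds, and the choice of mean pooling are all hypothetical and chosen only for illustration.

```python
import numpy as np

def aggregate_to_voxels(centers, features, grid_size=4, bounds=(-1.0, 1.0)):
    """Bin points (e.g. Gaussian centers) into a regular voxel grid and
    mean-pool their features per occupied voxel.

    Assumptions (not from the paper): a cubic scene spanning `bounds` on
    every axis, and mean pooling as the aggregation rule.
    """
    lo, hi = bounds
    # Map each coordinate to an integer voxel index in [0, grid_size - 1].
    idx = np.floor((centers - lo) / (hi - lo) * grid_size).astype(np.int64)
    idx = np.clip(idx, 0, grid_size - 1)
    # Flatten the (i, j, k) index into a single voxel id.
    flat = (idx[:, 0] * grid_size + idx[:, 1]) * grid_size + idx[:, 2]
    voxel_ids, inverse, counts = np.unique(
        flat, return_inverse=True, return_counts=True
    )
    inverse = inverse.reshape(-1)  # keep a 1-D index across NumPy versions
    # Sum the features that fall into each occupied voxel, then average.
    pooled = np.zeros((len(voxel_ids), features.shape[1]), dtype=np.float64)
    np.add.at(pooled, inverse, features)
    pooled /= counts[:, None]
    return voxel_ids, pooled, counts

# Usage: three points, two of which share a voxel in a 2x2x2 grid.
centers = np.array([[-0.9, -0.9, -0.9], [-0.8, -0.8, -0.8], [0.9, 0.9, 0.9]])
features = np.array([[1.0, 0.0], [3.0, 0.0], [5.0, 2.0]])
voxel_ids, pooled, counts = aggregate_to_voxels(centers, features, grid_size=2)
```

Working on occupied voxels rather than individual Gaussians gives a fixed, spatially ordered set of tokens, which is the kind of structure that spatial and semantic interaction modules could operate on.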
Findings
Outperforms existing initialization methods like Shap-E.
Effectively handles lexically rich and complex texts.
Seamlessly integrates with state-of-the-art training frameworks.
Abstract
Text-to-3D content creation has recently received much attention, especially with the prevalence of 3D Gaussian Splatting (GS). In general, GS-based methods comprise two key stages: initialization and rendering optimization. For initialization, existing works directly apply random sphere initialization or 3D diffusion models, e.g., Point-E, to derive the initial shapes. However, such strategies suffer from two critical problems: 1) the final shapes remain similar to the initial ones even after training; 2) shapes can be produced only from simple texts, e.g., "a dog", not from lexically richer texts, e.g., "a dog is sitting on the top of the airplane". To address these problems, this paper proposes a novel general framework to boost 3D GS initialization for text-to-3D generation based on lexical richness. Our key idea is to aggregate 3D Gaussians into spatially…
Taxonomy
Topics: Natural Language Processing Techniques · Human Motion and Animation · Handwritten Text Recognition Techniques
Methods: Diffusion
