
TL;DR
This thesis introduces two methods that advance automatic 3D content creation: Instant3D, which generates 3D assets rapidly by combining multi-view diffusion with feed-forward sparse-view reconstruction, and FastMap, which accelerates 3D reconstruction.
Contribution
It presents novel, faster algorithms for 3D content generation and reconstruction, significantly reducing processing time while maintaining quality.
Findings
Instant3D produces high-quality 3D assets in 5-20 seconds.
FastMap achieves up to 10x speedup over previous methods.
Both methods maintain quality comparable to existing approaches.
Abstract
Automatic 3D content creation seeks to replace labor-intensive modeling and scanning pipelines with systems that can synthesize or recover 3D assets directly from text or images. Its applications span video games, virtual reality, robotics, and simulation, enabling rapid asset prototyping, diverse interactive world generation, and efficient 3D data collection for training foundation models. Contemporary solutions largely follow two complementary paradigms: (i) text- or image-to-3D generation, which learns priors over 3D geometry and appearance to create novel assets from natural language or a single view image; and (ii) 3D reconstruction, which estimates camera poses and geometry from RGB images. This thesis advances both directions. On the generation side, I introduce Instant3D, which combines multi-view diffusion with feed-forward sparse-view 3D reconstruction to produce high-quality…
