WorldCraft: Photo-Realistic 3D World Creation and Customization via LLM Agents
Xinhang Liu, Chi-Keung Tang, Yu-Wing Tai

TL;DR
WorldCraft is a system that uses large language model agents to generate and customize photorealistic 3D virtual worlds through natural language commands, making scene creation accessible to non-professionals.
Contribution
WorldCraft introduces a novel LLM-based framework with specialized agents for scene generation, customization, and animation, enabling intuitive and detailed 3D world creation.
Findings
Demonstrates versatility across various scene complexities
Enables precise object customization via auto-verification
Supports natural language control for scene layout and animation
Abstract
Constructing photorealistic virtual worlds has applications across various fields, but it often requires the extensive labor of highly trained professionals to operate conventional 3D modeling software. To democratize this process, we introduce WorldCraft, a system where large language model (LLM) agents leverage procedural generation to create indoor and outdoor scenes populated with objects, allowing users to control individual object attributes and the scene layout using intuitive natural language commands. In our framework, a coordinator agent manages the overall process and works with two specialized LLM agents to complete the scene creation: ForgeIt, which integrates an ever-growing manual through auto-verification to enable precise customization of individual objects, and ArrangeIt, which formulates hierarchical optimization problems to achieve a layout that balances ergonomic…
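The coordinator-plus-specialists structure described in the abstract can be illustrated with a minimal routing sketch. This is not the authors' implementation: it makes no LLM calls, and the names `Task`, `Coordinator`, `forge_it`, and `arrange_it` are hypothetical stand-ins chosen for illustration. It only shows the assumed control flow, in which a coordinator hands object-level requests to a ForgeIt-like agent and layout requests to an ArrangeIt-like agent.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Task:
    kind: str    # "object" for per-object customization, "layout" for arrangement
    prompt: str  # natural language instruction from the user


def forge_it(prompt: str) -> str:
    """Hypothetical stand-in for the object-customization agent."""
    return f"[ForgeIt] customized object: {prompt}"


def arrange_it(prompt: str) -> str:
    """Hypothetical stand-in for the layout agent."""
    return f"[ArrangeIt] arranged layout: {prompt}"


@dataclass
class Coordinator:
    # Assumed mapping from task kind to the specialist that handles it.
    agents: Dict[str, Callable[[str], str]] = field(
        default_factory=lambda: {"object": forge_it, "layout": arrange_it}
    )

    def run(self, tasks: List[Task]) -> List[str]:
        """Dispatch each task to its specialist and collect the results in order."""
        results = []
        for task in tasks:
            agent = self.agents.get(task.kind)
            if agent is None:
                raise ValueError(f"no agent registered for task kind {task.kind!r}")
            results.append(agent(task.prompt))
        return results
```

In a real system each stand-in function would wrap an LLM call and a procedural-generation backend; the dictionary-based dispatch is simply one way to keep the coordinator independent of how many specialists exist.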
