SceneCraft: An LLM Agent for Synthesizing 3D Scene as Blender Code
Ziniu Hu, Ahmet Iscen, Aashi Jain, Thomas Kipf, Yisong Yue, David A. Ross, Cordelia Schmid, Alireza Fathi

TL;DR
SceneCraft is an LLM agent that converts text descriptions into Blender Python scripts. It synthesizes complex 3D scenes by combining scene-graph planning, analysis of rendered images, and library learning.
Contribution
SceneCraft combines scene graph modeling, iterative refinement with vision-language models, and a reusable script library to generate 3D scenes from text.
Findings
Outperforms existing LLM agents in scene rendering accuracy
Successfully reconstructs detailed scenes from complex movie data
Demonstrates potential for guiding video generation models
Abstract
This paper introduces SceneCraft, a Large Language Model (LLM) Agent converting text descriptions into Blender-executable Python scripts which render complex scenes with up to a hundred 3D assets. This process requires complex spatial planning and arrangement. We tackle these challenges through a combination of advanced abstraction, strategic planning, and library learning. SceneCraft first models a scene graph as a blueprint, detailing the spatial relationships among assets in the scene. SceneCraft then writes Python scripts based on this graph, translating relationships into numerical constraints for asset layout. Next, SceneCraft leverages the perceptual strengths of vision-language foundation models like GPT-V to analyze rendered images and iteratively refine the scene. On top of this process, SceneCraft features a library learning mechanism that compiles common script functions…
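The step of "translating relationships into numerical constraints for asset layout" can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the relation names (`proximity`, `left_of`), the scoring formulas, the 2D layout, and the random-search optimizer are all hypothetical stand-ins. The sketch avoids Blender's `bpy` module so that it runs in plain Python.

```python
import random
from dataclasses import dataclass

# NOTE: hypothetical relation names and scoring functions, chosen for
# illustration only; they are not SceneCraft's actual constraint set.


@dataclass(frozen=True)
class Relation:
    kind: str  # "proximity" or "left_of" in this sketch
    a: str     # name of the first asset
    b: str     # name of the second asset


def proximity_score(pa, pb, max_dist=2.0):
    """Return 1.0 if the assets are within max_dist, else decay toward 0."""
    dist = ((pa[0] - pb[0]) ** 2 + (pa[1] - pb[1]) ** 2) ** 0.5
    return 1.0 if dist <= max_dist else max_dist / dist


def left_of_score(pa, pb):
    """Return 1.0 if asset a has a smaller x coordinate than asset b."""
    return 1.0 if pa[0] < pb[0] else 0.0


SCORERS = {"proximity": proximity_score, "left_of": left_of_score}


def layout_score(layout, relations):
    """Mean constraint satisfaction over all edges of the scene graph."""
    if not relations:
        return 1.0
    total = sum(SCORERS[r.kind](layout[r.a], layout[r.b]) for r in relations)
    return total / len(relations)


def search_layout(assets, relations, iters=2000, extent=10.0, seed=0):
    """Random search: sample 2D layouts and keep the best-scoring one."""
    rng = random.Random(seed)
    best, best_score = None, -1.0
    for _ in range(iters):
        layout = {
            name: (rng.uniform(-extent, extent), rng.uniform(-extent, extent))
            for name in assets
        }
        score = layout_score(layout, relations)
        if score > best_score:
            best, best_score = layout, score
    return best, best_score


# A tiny scene graph: the chair is near the table, the lamp is left of it.
assets = ["table", "chair", "lamp"]
relations = [
    Relation("proximity", "chair", "table"),
    Relation("left_of", "lamp", "table"),
]
layout, score = search_layout(assets, relations)
```

In a Blender script, the resulting coordinates could then be written to each object's `location` attribute. An agent in the style described above might render the scene, ask a vision-language model to critique the image, and revise the scoring functions before searching again; that loop is outside the scope of this sketch.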
Taxonomy
Topics: Image Processing and 3D Reconstruction · Computer Graphics and Visualization Techniques · 3D Shape Modeling and Analysis
Methods: Lib
