BlenderAlchemy: Editing 3D Graphics with Vision-Language Models
Ian Huang, Guandao Yang, Leonidas Guibas

TL;DR
This paper introduces BlenderAlchemy, a system that uses vision-language models (VLMs) to automate complex 3D scene editing in Blender. It translates a user's intent into a sequence of editing actions and uses visual reasoning to guide the search for that sequence.
Contribution
It combines VLMs with visual imagination to generate and evaluate candidate editing sequences, with the aim of automating 3D graphics design.
Findings
Successfully automates Blender editing sequences for materials, geometry, and lighting.
Demonstrates the system's ability to interpret text and images for scene editing.
Provides empirical evidence of effective visual grounding in design tasks.
Abstract
Graphics design is important for various applications, including movie production and game design. To create a high-quality scene, designers usually need to spend hours in software like Blender, in which they might need to interleave and repeat operations, such as connecting material nodes, hundreds of times. Moreover, slightly different design goals may require completely different sequences, making automation difficult. In this paper, we propose a system that leverages Vision-Language Models (VLMs), like GPT-4V, to intelligently search the design action space to arrive at an answer that can satisfy a user's intent. Specifically, we design a vision-based edit generator and state evaluator to work together to find the correct sequence of actions to achieve the goal. Inspired by the role of visual imagination in the human design process, we supplement the visual reasoning capabilities of…
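The interplay between an edit generator and a state evaluator described in the abstract can be pictured as a propose-and-select loop. The sketch below is only an illustration under stated assumptions, not the authors' implementation. The names `search_edits`, `toy_generate`, and `toy_score`, the `depth` and `breadth` parameters, and the dictionary state are all hypothetical. In the real system the generator and evaluator would be VLM calls operating on Blender programs and renders; here they are replaced by a random perturbation and a distance to a fixed target so that the example runs on its own.

```python
import random
from typing import Callable, Dict, List

# Hypothetical stand-in for a Blender scene/program state: named scalar parameters.
State = Dict[str, float]


def search_edits(
    initial: State,
    generate: Callable[[State, int], List[State]],
    score: Callable[[State], float],
    depth: int = 4,
    breadth: int = 6,
) -> State:
    """Run `depth` rounds of `breadth` proposals and keep the best-scoring state."""
    best, best_score = initial, score(initial)
    for _ in range(depth):
        # Proposals in each round are edits of the best state found so far.
        for candidate in generate(best, breadth):
            candidate_score = score(candidate)
            if candidate_score > best_score:
                best, best_score = candidate, candidate_score
    return best


# Toy stand-ins (assumptions): a real generator/evaluator would query a VLM.
TARGET: State = {"roughness": 0.2, "metallic": 0.9}
_rng = random.Random(0)  # seeded so the run is repeatable


def toy_generate(state: State, n: int) -> List[State]:
    """Propose n random perturbations of the state, clamped to [0, 1]."""
    return [
        {k: min(1.0, max(0.0, v + _rng.uniform(-0.3, 0.3))) for k, v in state.items()}
        for _ in range(n)
    ]


def toy_score(state: State) -> float:
    """Higher is better: negative squared distance to the target parameters."""
    return -sum((state[k] - TARGET[k]) ** 2 for k in TARGET)


start: State = {"roughness": 0.8, "metallic": 0.1}
result = search_edits(start, toy_generate, toy_score)
```

Because the loop only replaces the current best state when a candidate scores strictly higher, the returned state never scores below the starting one. A scalar score is a simplification made here for brevity; an evaluator that compares two candidates at a time would fit the same loop.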
Taxonomy
Topics: 3D Surveying and Cultural Heritage · 3D Modeling in Geospatial Applications
Methods: RoIPool · Softmax · RoIAlign
