SceneSeer: 3D Scene Design with Natural Language
Angel X. Chang, Mihail Eric, Manolis Savva, Christopher D. Manning

TL;DR
SceneSeer is an interactive system that enables users to create and refine 3D scenes using natural language, leveraging a learned spatial knowledge base to generate realistic arrangements based on user descriptions.
Contribution
The paper introduces SceneSeer, a novel system that translates natural language descriptions into 3D scenes, allowing iterative refinement through textual commands.
Findings
Generated scenes are comparable in quality to manually designed scenes.
Users can effectively refine scenes using simple natural language commands.
The system outperforms baseline methods in perceptual quality evaluations.
Abstract
Designing 3D scenes is currently a creative task that requires significant expertise and effort in using complex 3D design interfaces. This effortful design process stands in stark contrast to the ease with which people can use language to describe real and imaginary environments. We present SceneSeer: an interactive text to 3D scene generation system that allows a user to design 3D scenes using natural language. A user provides input text from which we extract explicit constraints on the objects that should appear in the scene. Given these explicit constraints, the system then uses a spatial knowledge base learned from an existing database of 3D scenes and 3D object models to infer an arrangement of the objects forming a natural scene matching the input description. Using textual commands, the user can then iteratively refine the created scene by adding, removing, replacing, and…
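The two text-facing steps described in the abstract (extracting explicit object constraints from a description, then refining the scene with textual commands) can be illustrated with a toy sketch. This is not the authors' implementation: the vocabulary, the `Scene` class, and the functions `extract_constraints` and `apply_command` are hypothetical names invented for illustration, the parsing is simple keyword matching, and the learned spatial knowledge base that infers object arrangement is left out entirely.

```python
import re
from dataclasses import dataclass, field
from typing import Dict

# Hypothetical toy vocabulary; the real system draws on a database of 3D models.
KNOWN_OBJECTS = {"table", "chair", "lamp", "plate", "sofa"}
NUMBER_WORDS = {"a": 1, "an": 1, "one": 1, "two": 2, "three": 3, "four": 4}


@dataclass
class Scene:
    # Maps an object category to how many instances the scene should contain.
    objects: Dict[str, int] = field(default_factory=dict)


def singular(word: str) -> str:
    """Strip a trailing 's' when the result is a known object category."""
    if word.endswith("s") and word[:-1] in KNOWN_OBJECTS:
        return word[:-1]
    return word


def extract_constraints(text: str) -> Dict[str, int]:
    """Collect object categories and counts mentioned in a description."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts: Dict[str, int] = {}
    for i, tok in enumerate(tokens):
        noun = singular(tok)
        if noun in KNOWN_OBJECTS:
            prev = tokens[i - 1] if i > 0 else ""
            # Default to one instance when no number word precedes the noun.
            counts[noun] = counts.get(noun, 0) + NUMBER_WORDS.get(prev, 1)
    return counts


def apply_command(scene: Scene, command: str) -> Scene:
    """Return a new scene after an add / remove / replace command."""
    tokens = re.findall(r"[a-z]+", command.lower())
    if not tokens:
        return scene
    verb = tokens[0]
    nouns = [singular(t) for t in tokens[1:] if singular(t) in KNOWN_OBJECTS]
    objects = dict(scene.objects)
    if verb == "add" and nouns:
        objects[nouns[0]] = objects.get(nouns[0], 0) + 1
    elif verb == "remove" and nouns:
        objects.pop(nouns[0], None)
    elif verb == "replace" and len(nouns) == 2 and nouns[0] != nouns[1]:
        old, new = nouns
        if old in objects:
            # Carry the count of the replaced category over to the new one.
            objects[new] = objects.get(new, 0) + objects.pop(old)
    return Scene(objects)


scene = Scene(extract_constraints("There is a table and two chairs."))
scene = apply_command(scene, "add a lamp")
scene = apply_command(scene, "replace the lamp with a sofa")
scene = apply_command(scene, "remove the chairs")
print(scene.objects)
```

In this sketch a scene is only a bag of object counts; a fuller version would attach positions and orientations to each object, which is the part the abstract attributes to the learned spatial knowledge base.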
Taxonomy
Topics: Multimodal Machine Learning Applications · Human Motion and Animation · Handwritten Text Recognition Techniques
