SceneTeller: Language-to-3D Scene Generation

Başak Melis Öcal; Maxim Tatarchenko; Sezer Karaoglu; Theo Gevers

arXiv:2407.20727 · cs.CV · July 31, 2024

TL;DR

SceneTeller introduces a novel text-to-3D scene generation system that enables users to create and modify high-quality indoor 3D scenes through natural language prompts, making 3D design accessible to non-experts.

Contribution

The paper pioneers a text-based 3D room design approach that combines in-context learning, CAD model retrieval, and stylization, advancing the democratization of 3D scene creation.

Findings

01 Produces state-of-the-art 3D scenes from text prompts

02 Allows easy scene modification with additional text prompts

03 Accessible to users without professional 3D design skills

Abstract

Designing high-quality indoor 3D scenes is important in many practical applications, such as room planning or game development. Conventionally, this has been a time-consuming process that requires both artistic skill and familiarity with professional software, putting it out of reach for lay users. However, recent advances in generative AI have established a solid foundation for democratizing 3D design. In this paper, we propose a pioneering approach for text-based 3D room design. Given a natural-language prompt describing the object placement in the room, our method produces a corresponding high-quality 3D scene. With an additional text prompt, users can change the appearance of the entire scene or of individual objects in it. Built using in-context learning, CAD model retrieval and 3D-Gaussian-Splatting-based stylization, our turnkey pipeline produces…

Taxonomy

Topics: Human Motion and Animation · Multimodal Machine Learning Applications · 3D Modeling in Geospatial Applications