Aladdin: Zero-Shot Hallucination of Stylized 3D Assets from Abstract Scene Descriptions

Ian Huang; Vrishab Krishna; Omoruyi Atekha; Leonidas Guibas

arXiv:2306.06212 · cs.CV · June 13, 2023 · 1 citation

TL;DR

This paper introduces Aladdin, a system leveraging foundation models to generate stylized 3D scene assets from abstract descriptions, enabling open-world creativity and reducing reliance on limited datasets.

Contribution

Aladdin is the first system to translate abstract scene descriptions into stylized 3D assets using foundation models with an interpretable, controllable pipeline.

Findings

01 Achieves 91% human-rated semantic faithfulness
02 Uses a multi-model foundation approach for open-world concepts
03 Introduces novel metrics for stylized 3D asset generation

Abstract

What constitutes the "vibe" of a particular scene? What should one find in "a busy, dirty city street", "an idyllic countryside", or "a crime scene in an abandoned living room"? The translation from abstract scene descriptions to stylized scene elements cannot be done with any generality by extant systems trained on rigid and limited indoor datasets. In this paper, we propose to leverage the knowledge captured by foundation models to accomplish this translation. We present a system that can serve as a tool to generate stylized assets for 3D scenes described by a short phrase, without the need to enumerate the objects to be found within the scene or give instructions on their appearance. Additionally, it is robust to open-world concepts in a way that traditional methods trained on limited data are not, affording more creative freedom to the 3D artist. Our system demonstrates this using a…

Code & Models

Repositories

ianhuang0630/aladdin (PyTorch, official)

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Computer Graphics and Visualization Techniques · Advanced Vision and Imaging

Methods: Diffusion