Dream-in-Style: Text-to-3D Generation Using Stylized Score Distillation

Hubert Kompanowski; Binh-Son Hua

arXiv:2406.18581 · cs.CV · February 14, 2025

TL;DR

Dream-in-Style introduces a novel text-to-3D generation method that incorporates style transfer using a stylized score distillation loss, enabling the creation of stylized 3D objects aligned with text prompts.

Contribution

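The paper proposes a stylized score distillation loss that combines a pretrained text-to-image model with a modified copy of it, in which the key and value features of the self-attention layers are manipulated to inject the style of a reference image. This lets the 3D object and its style be generated together in a single optimization.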

Findings

01 Outperforms state-of-the-art methods in visual quality

02 Achieves strong style transfer fidelity

03 User study confirms preference for generated models

Abstract

We present a method to generate 3D objects in styles. Our method takes a text prompt and a style reference image as input and reconstructs a neural radiance field to synthesize a 3D model with the content aligning with the text prompt and the style following the reference image. To simultaneously generate the 3D object and perform style transfer in one go, we propose a stylized score distillation loss to guide a text-to-3D optimization process to output visually plausible geometry and appearance. Our stylized score distillation is based on a combination of an original pretrained text-to-image model and its modified sibling with the key and value features of self-attention layers manipulated to inject styles from the reference image. Comparisons with state-of-the-art methods demonstrated the strong visual performance of our method, further supported by the quantitative results from our…
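The two ingredients named in the abstract, injecting style through the keys and values of self-attention and combining the original model with its style-modified sibling, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, the single-head attention, the full replacement of keys and values by style features, the linear blend controlled by `style_weight`, and the DreamFusion-style gradient form `w(t) * (eps_hat - eps)` are all assumptions made here for illustration; the abstract does not specify how the features are manipulated or how the two models are combined.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(q, k, v):
    """Single-head scaled dot-product attention; q is (n, d), k and v are (m, d)."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v


def style_injected_attention(q_render, k_style, v_style):
    """Hypothetical style injection: queries come from the rendered view,
    while keys and values are taken from the style reference image."""
    return self_attention(q_render, k_style, v_style)


def stylized_sds_grad(eps_original, eps_styled, noise, style_weight,
                      timestep_weight=1.0):
    """Hypothetical stylized score distillation gradient w.r.t. the rendered image.

    eps_original: noise predicted by the unmodified text-to-image model
    eps_styled:   noise predicted by the style-injected sibling
    noise:        the Gaussian noise that was added to the rendering
    style_weight: assumed blend factor in [0, 1] between the two predictions
    """
    eps_mix = (1.0 - style_weight) * eps_original + style_weight * eps_styled
    return timestep_weight * (eps_mix - noise)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    q = rng.normal(size=(4, 8))        # features of the rendered view
    k_s = rng.normal(size=(6, 8))      # key features of the style image
    v_s = rng.normal(size=(6, 8))      # value features of the style image
    print(style_injected_attention(q, k_s, v_s).shape)

    eps_o, eps_s, eps = (rng.normal(size=(3, 3)) for _ in range(3))
    print(stylized_sds_grad(eps_o, eps_s, eps, style_weight=0.5).shape)
```

With `style_weight=0` the sketch reduces to an ordinary score distillation gradient from the original model, and with `style_weight=1` only the style-injected model drives the optimization; intermediate values trade content fidelity against style strength under this assumed blend.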

Taxonomy

Topics: Human Motion and Animation · Video Analysis and Summarization · Natural Language Processing Techniques