Structured 3D Latents Are Surprisingly Powerful: Unleashing Generalizable Style with 2D Diffusion
Yiran Qiao, Yiren Lu, Yunlai Zhou, Disheng Liu, Linlin Hou, Rui Yang, Yu Yin, Jing Ma

TL;DR
This paper introduces DiLAST, a method that uses a pretrained 2D diffusion model to guide 3D style transfer, enabling generalizable style control even for out-of-distribution styles.
Contribution
It uses guidance from a pretrained 2D diffusion model to improve 3D style transfer, addressing the reliance of existing methods on style images that lie within or close to the training distribution of the 3D generation model.
Findings
Enables stylization of 3D assets with out-of-distribution styles.
Improves 3D style transfer quality across various models.
Demonstrates effectiveness and versatility of the approach.
Abstract
3D asset generation plays a pivotal role in fields such as gaming and virtual reality, enabling the rapid synthesis of high-fidelity 3D objects from a single or multiple images. Building on this capability, enabling style-controllable generation naturally emerges as an important and desirable direction. However, existing approaches typically rely on style images that lie within or are similar to the training distribution of 3D generation models. When presented with out-of-distribution (OOD) styles, their performance degrades significantly or even fails. To address this limitation, we introduce DiLAST: 2D Diffusion-based Latent Awakening for 3D Style Transfer. Specifically, we leverage a pretrained 2D diffusion model as a teacher to provide rich and generalizable style priors. By aligning rendered views with the target style under diffusion-based guidance, our method optimizes…
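The teacher-guided alignment described in the abstract can be illustrated with a toy optimization loop. This is a minimal sketch under loud assumptions, not the DiLAST implementation: the abstract is cut off before it says what is optimized, so the sketch assumes a plain latent vector; `render`, `teacher_guidance`, `style_loss`, and `optimize_latent` are hypothetical names; the "renderer" is a fixed linear map per view; and the "teacher" is a simple squared-error gradient standing in for a frozen 2D diffusion model.

```python
import random


def render(latent, view):
    """Toy differentiable 'renderer' (assumption): one fixed linear map per view."""
    return [sum(w * z for w, z in zip(row, latent)) for row in view]


def teacher_guidance(image, style):
    """Stand-in for a frozen 2D teacher (assumption).

    Returns d(loss)/d(image) for loss = 0.5 * ||image - style||^2.
    A real diffusion teacher would supply a different guidance gradient.
    """
    return [p - s for p, s in zip(image, style)]


def style_loss(latent, views, style):
    """Mean squared-error style loss over all rendered views."""
    total = 0.0
    for view in views:
        image = render(latent, view)
        total += 0.5 * sum((p - s) ** 2 for p, s in zip(image, style))
    return total / len(views)


def optimize_latent(latent, views, style, steps=200, lr=0.05):
    """Gradient descent on the latent only; renderer and teacher stay fixed."""
    latent = list(latent)
    for _ in range(steps):
        grad = [0.0] * len(latent)
        for view in views:
            g_img = teacher_guidance(render(latent, view), style)
            # Chain rule through the linear renderer: grad_z += view^T @ g_img
            for row, g in zip(view, g_img):
                for j, w in enumerate(row):
                    grad[j] += w * g / len(views)
        latent = [z - lr * g for z, g in zip(latent, grad)]
    return latent


rng = random.Random(0)
# Two hypothetical camera views, each mapping a 4-d latent to a 3-pixel "image".
views = [[[rng.uniform(-1, 1) for _ in range(4)] for _ in range(3)] for _ in range(2)]
style = [0.5, -0.2, 0.1]  # toy style target in image space
z0 = [0.0] * 4
z1 = optimize_latent(z0, views, style)
print("loss before:", style_loss(z0, views, style))
print("loss after: ", style_loss(z1, views, style))
```

The point of the sketch is the data flow: gradients from a 2D teacher pass back through rendering into a 3D-side representation, while the teacher itself is never updated.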
