Style3D: Attention-guided Multi-view Style Transfer for 3D Object Generation

Bingjie Song; Xin Huang; Ruting Xie; Xue Wang; Qing Wang

arXiv:2412.03571·cs.CV·December 5, 2024

TL;DR

Style3D introduces an attention-guided multi-view style transfer method that enables instant, coherent stylization of 3D objects from content and style images, improving efficiency and visual quality.

Contribution

The paper proposes a multi-view dual-feature alignment technique, built on MultiFusion Attention, that enables scalable, instant 3D stylization without case- or style-specific training.

Findings

01  Outperforms existing methods in visual quality

02  Achieves higher computational efficiency

03  Ensures style consistency across multiple views

Abstract

We present Style3D, a novel approach for generating stylized 3D objects from a content image and a style image. Unlike most previous methods that require case- or style-specific training, Style3D supports instant 3D object stylization. Our key insight is that 3D object stylization can be decomposed into two interconnected processes: multi-view dual-feature alignment and sparse-view spatial reconstruction. We introduce MultiFusion Attention, an attention-guided technique to achieve multi-view stylization from the content-style pair. Specifically, the query features from the content image preserve geometric consistency across multiple views, while the key and value features from the style image are used to guide the stylistic transfer. This dual-feature alignment ensures that spatial coherence and stylistic fidelity are maintained across multi-view images. Finally, a large 3D…
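
The query/key/value split described in the abstract can be illustrated with plain scaled dot-product cross-attention. The sketch below is not the paper's MultiFusion Attention, whose details are not given in this excerpt; it only shows the generic mechanism in which queries come from content features and keys and values come from style features. The function names (`cross_attention`, `stylize_views`), the list-of-lists feature layout, and the per-view loop are all assumptions made for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(content_q, style_k, style_v):
    """Generic scaled dot-product cross-attention (illustrative only).

    content_q: query vectors taken from content features, shape [Nq][d]
    style_k:   key vectors taken from style features,     shape [Nk][d]
    style_v:   value vectors taken from style features,   shape [Nk][dv]
    Returns one output vector per query, shape [Nq][dv].
    """
    d = len(content_q[0])
    scale = 1.0 / math.sqrt(d)
    dv = len(style_v[0])
    out = []
    for q in content_q:
        # Similarity of this content query to every style key.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in style_k]
        weights = softmax(scores)
        # Each output is a convex combination of the style values.
        out.append([sum(w * v[j] for w, v in zip(weights, style_v))
                    for j in range(dv)])
    return out


def stylize_views(view_queries, style_k, style_v):
    """Hypothetical helper: apply the same style keys/values to every view.

    Sharing style_k and style_v across views is an assumption of this sketch,
    meant to mirror the idea of one style image guiding all views.
    """
    return [cross_attention(q, style_k, style_v) for q in view_queries]
```

Because the attention weights sum to one, every output vector is a weighted average of the style values, while which style values dominate is decided by the content queries. That is the sense in which the content side controls spatial layout and the style side supplies appearance.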

Taxonomy

Topics: Human Pose and Action Recognition · Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques

Methods: Softmax · Attention Is All You Need