M2StyleGS: Multi-Modality 3D Style Transfer with Gaussian Splatting
Xingyu Miao, Xueqi Qiu, Haoran Duan, Yawen Huang, Xian Wu, Jingjing Deng, Yang Long

TL;DR
M2StyleGS is a real-time multi-modality 3D style transfer method using Gaussian Splatting and CLIP, enabling flexible reference inputs like text and images for consistent style transfer in 3D scenes.
Contribution
The paper introduces M2StyleGS, a novel 3D style transfer technique that employs Gaussian Splatting and multi-modality knowledge to improve flexibility and consistency.
Findings
Achieves up to 32.92% better consistency than previous methods.
Generates style-enhanced views from text or images as references.
Employs novel loss functions for better style matching and color suppression.
Abstract
Conventional 3D style transfer methods rely on a fixed reference image to apply artistic patterns to 3D scenes. However, in practical applications such as virtual or augmented reality, users often prefer more flexible inputs, including textual descriptions and diverse imagery. In this work, we introduce M2StyleGS, a novel real-time styling technique that generates a sequence of precisely color-mapped views. It uses 3D Gaussian Splatting (3DGS) as the 3D representation and multi-modality knowledge refined by CLIP as the reference style. M2StyleGS resolves the abnormal transformation issue with a precise feature alignment, namely subdivisive flow, which strengthens the projection of the mapped CLIP text-visual combination feature to the VGG style feature. In addition, we introduce observation loss, which helps the stylized scene better match the reference style during the…
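The projection of a combined CLIP text-visual feature into a VGG-style feature space can be pictured with a small sketch. This is not the paper's subdivisive flow, which the abstract does not specify; it is a plain weighted blend followed by a linear map, shown only to make the data flow concrete. The function names, the `text_weight` parameter, the 512 and 256 feature sizes, and the random stand-in features and weights are all assumptions for illustration.

```python
import numpy as np

def l2_normalize(x, eps=1e-8):
    """Scale a feature vector to unit length."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def blend_clip_features(text_feat, image_feat, text_weight=0.5):
    """Hypothetical text-visual combination: a convex blend of the two
    normalized embeddings, renormalized afterwards."""
    t = l2_normalize(text_feat)
    v = l2_normalize(image_feat)
    mixed = text_weight * t + (1.0 - text_weight) * v
    return l2_normalize(mixed)

def project_to_style_space(clip_feat, weight, bias):
    """Hypothetical linear projection from the CLIP embedding space to a
    style-feature space (a stand-in for the learned mapping)."""
    return clip_feat @ weight + bias

# Random stand-ins for real CLIP embeddings and learned parameters.
rng = np.random.default_rng(0)
text_feat = rng.normal(size=512)    # assumed CLIP text embedding size
image_feat = rng.normal(size=512)   # assumed CLIP image embedding size
weight = rng.normal(scale=0.02, size=(512, 256))  # assumed style dimension
bias = np.zeros(256)

combined = blend_clip_features(text_feat, image_feat, text_weight=0.7)
style_feat = project_to_style_space(combined, weight, bias)
print(style_feat.shape)
```

Setting `text_weight` to 1.0 or 0.0 reduces the sketch to a text-only or image-only reference, which mirrors the flexible inputs the abstract describes.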
