CLIP-based Neural Neighbor Style Transfer for 3D Assets
Shailesh Mishra, Jonathan Granskog

TL;DR
This paper introduces a CLIP-based style transfer method for 3D assets. It optimizes an asset's texture through a differentiable renderer, using a nearest-neighbor feature matching loss to transfer style from a set of images, and the resulting loss focuses more on texture than on geometric shapes.
Contribution
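It presents a CLIP-based style loss for 3D texture transfer that yields a different appearance from a VGG-based loss, and extends the loss to support multiple style images and loss-based control over the color palette, with automatic palette extraction from the style images.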
Findings
CLIP-based loss emphasizes texture over shape.
Supports multiple style images for transfer.
Enables automatic color palette extraction.
Abstract
We present a method for transferring the style from a set of images to a 3D object. The texture appearance of an asset is optimized with a differentiable renderer in a pipeline based on losses using pretrained deep neural networks. More specifically, we utilize a nearest-neighbor feature matching loss with CLIP-ResNet50 to extract the style from images. We show that a CLIP-based style loss provides a different appearance over a VGG-based loss by focusing more on texture over geometric shapes. Additionally, we extend the loss to support multiple images and enable loss-based control over the color palette combined with automatic color palette extraction from style images.
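The nearest-neighbor feature matching loss mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, the use of cosine distance is an assumption borrowed from common nearest-neighbor style transfer formulations, and pooling the features of all style images into one candidate set is only one plausible way to support multiple images. Feature extraction with CLIP-ResNet50 and the differentiable renderer are omitted; features are given as plain lists of floats.

```python
import math


def cosine_distance(a, b):
    """Cosine distance between two feature vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Small epsilon guards against division by zero for all-zero vectors.
    return 1.0 - dot / (norm_a * norm_b + 1e-8)


def nn_feature_matching_loss(rendered_feats, style_feats_per_image):
    """Mean distance from each rendered feature to its nearest style feature.

    rendered_feats: list of feature vectors from a rendered view.
    style_feats_per_image: one list of feature vectors per style image.
    Pooling all style images into a single candidate set is an assumption.
    """
    style_pool = [f for feats in style_feats_per_image for f in feats]
    total = 0.0
    for r in rendered_feats:
        # Nearest neighbor: keep only the closest style feature.
        total += min(cosine_distance(r, s) for s in style_pool)
    return total / len(rendered_feats)


# Toy usage: two rendered features, two style images with one feature each.
rendered = [[1.0, 0.0], [0.0, 1.0]]
style = [[[1.0, 0.0]], [[0.0, 2.0]]]
loss = nn_feature_matching_loss(rendered, style)
```

In the toy usage every rendered feature has a style feature pointing in the same direction, so the loss is close to zero. In an optimization pipeline, a loss of this shape would be computed on differentiable tensors so that gradients can flow back to the texture.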
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Advanced Vision and Imaging · Computer Graphics and Visualization Techniques
