ICE-G: Image Conditional Editing of 3D Gaussian Splats
Vishnu Jaganathan, Hannah Hanyun Huang, Muhammad Zubair Irshad, Varun Jampani, Amit Raj, Zsolt Kira

TL;DR
ICE-G enables fast, high-quality, and customizable editing of 3D models from a single reference view by leveraging semantic segmentation, feature matching, and Gaussian Splat representations for versatile editing tasks.
Contribution
The paper introduces a novel method for 3D model editing that combines semantic segmentation and feature matching with Gaussian Splat representations for efficient and detailed modifications.
Findings
Produces higher quality editing results
Enables fine-grained control over 3D edits
Supports diverse editing tasks including style transfer
Abstract
Recently, many techniques have emerged to create high-quality 3D assets and scenes. When it comes to editing these objects, however, existing approaches are either slow, compromise on quality, or do not provide enough customization. We introduce a novel approach to quickly edit a 3D model from a single reference view. Our technique first segments the edit image, and then matches semantically corresponding regions across chosen segmented dataset views using DINO features. A color or texture change from a particular region of the edit image can then be applied to other views automatically in a semantically sensible manner. These edited views act as an updated dataset to further train and re-style the 3D scene. The end result is therefore an edited 3D model. Our framework enables a wide variety of editing tasks such as manual local edits, correspondence-based style transfer from any…
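The matching step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes per-pixel feature maps are already available (in the paper these would come from DINO; here any float array works), that both images are already segmented into integer label maps, and that a simple mean-color shift stands in for the color or texture transfer. The function names `segment_descriptors`, `match_segments`, and `transfer_mean_color` are hypothetical.

```python
import numpy as np

def segment_descriptors(features, labels):
    """Mean-pool per-pixel features (H, W, C) inside each segment of labels (H, W)."""
    ids = np.unique(labels)
    desc = np.stack([features[labels == i].mean(axis=0) for i in ids])
    # L2-normalise so that a dot product between descriptors is a cosine similarity.
    desc = desc / (np.linalg.norm(desc, axis=1, keepdims=True) + 1e-8)
    return ids, desc

def match_segments(edit_feats, edit_labels, view_feats, view_labels):
    """Map each segment id in a dataset view to its most similar edit-image segment."""
    e_ids, e_desc = segment_descriptors(edit_feats, edit_labels)
    v_ids, v_desc = segment_descriptors(view_feats, view_labels)
    sim = v_desc @ e_desc.T          # (num_view_segments, num_edit_segments)
    best = sim.argmax(axis=1)        # best edit segment for every view segment
    return {int(v): int(e_ids[b]) for v, b in zip(v_ids, best)}

def transfer_mean_color(edit_rgb, edit_labels, view_rgb, view_labels, mapping):
    """Assumed stand-in for the edit transfer: shift each view region so its
    mean color equals that of the matched edit region, keeping local shading."""
    out = view_rgb.astype(np.float64).copy()
    for v_id, e_id in mapping.items():
        v_mask = view_labels == v_id
        target = edit_rgb[edit_labels == e_id].mean(axis=0)
        out[v_mask] += target - out[v_mask].mean(axis=0)
    return np.clip(out, 0.0, 1.0)
```

Under these assumptions, the recolored views returned by `transfer_mean_color` would play the role of the updated dataset that the abstract says is used to further train the 3D scene. Matching on pooled segment descriptors rather than individual pixels keeps the correspondence at the region level, which is what lets one edit propagate to views where the region appears at a different position or scale.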
Taxonomy
Topics: Advanced Neural Network Applications · Robotics and Sensor-Based Localization · Advanced Image and Video Retrieval Techniques
Methods: Attention Is All You Need · Residual Connection · Softmax · Layer Normalization · Linear Layer · Multi-Head Attention · Dense Connections · Vision Transformer · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · self-DIstillation with NO labels
