Fast and Efficient: Mask Neural Fields for 3D Scene Segmentation

Zihan Gao; Lingling Li; Licheng Jiao; Fang Liu; Xu Liu; Wenping Ma; Yuwei Guo; Shuyuan Yang

arXiv:2407.01220·cs.CV·December 20, 2024

TL;DR

MaskField introduces a novel approach for 3D scene segmentation that efficiently leverages foundation models by decomposing mask and semantic features, avoiding complex regularization, and achieving faster convergence than previous methods.

Contribution

It proposes MaskField, a method that separates the distillation of mask features from the distillation of semantic features when transferring 2D foundation models to 3D scene segmentation, improving both efficiency and accuracy.

Findings

1. Surpasses prior state-of-the-art methods in 3D segmentation accuracy.

2. Achieves remarkably fast convergence during training.

3. Naturally incorporates SAM-segmented object shapes without extra regularization.

Abstract

Understanding 3D scenes is a crucial challenge in computer vision research with applications spanning multiple domains. Recent advancements in distilling 2D vision-language foundation models into neural fields, like NeRF and 3DGS, enable open-vocabulary segmentation of 3D scenes from 2D multi-view images without the need for precise 3D annotations. However, while effective, these methods typically rely on the per-pixel distillation of high-dimensional CLIP features, introducing ambiguity and necessitating complex regularization strategies, which adds inefficiency during training. This paper presents MaskField, which enables efficient 3D open-vocabulary segmentation with neural fields from a novel perspective. Unlike previous methods, MaskField decomposes the distillation of mask and semantic features from foundation models by formulating a mask feature field and queries. MaskField…
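
The abstract names a mask feature field and a set of queries but does not spell out how they are combined. One common formulation, shown here only as a minimal sketch and not as the authors' implementation, scores every pixel feature against every query with a dot product, assigns each pixel to its best query, and then labels each query by cosine similarity between a per-query semantic embedding and text embeddings. All names, shapes, and the dot-product and argmax choices below are assumptions made for illustration.

```python
import numpy as np


def segment_with_queries(mask_features, queries, query_semantics, text_embeddings):
    """Assign each pixel to a query mask, then each mask to a text label.

    All inputs are hypothetical stand-ins, not MaskField's actual tensors:
      mask_features:   (H, W, D) per-pixel features rendered from a field.
      queries:         (N, D) one vector per candidate mask.
      query_semantics: (N, C) one semantic embedding per query.
      text_embeddings: (K, C) one embedding per text prompt.
    Returns an (H, W) array of label indices in [0, K).
    """
    # Mask logits: dot product between every pixel feature and every query.
    mask_logits = np.einsum("hwd,nd->hwn", mask_features, queries)
    pixel_to_query = mask_logits.argmax(axis=-1)  # (H, W)

    # Cosine similarity between query semantics and text prompts.
    q = query_semantics / np.linalg.norm(query_semantics, axis=-1, keepdims=True)
    t = text_embeddings / np.linalg.norm(text_embeddings, axis=-1, keepdims=True)
    query_to_label = (q @ t.T).argmax(axis=-1)  # (N,)

    # Each pixel inherits the label of the query it was assigned to.
    return query_to_label[pixel_to_query]
```

In this sketch the high-dimensional semantic comparison runs once per query rather than once per pixel, which is one way a decomposition of this kind can reduce per-pixel cost.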

Code & Models

Repositories

keloee/maskfield (JAX, official)

Taxonomy

Topics: Advanced Neural Network Applications · Industrial Vision Systems and Defect Detection · Image and Object Detection Techniques

Methods: Contrastive Language-Image Pre-training · Segment Anything Model