Laser: Efficient Language-Guided Segmentation in Neural Radiance Fields

Xingyu Miao; Haoran Duan; Yang Bai; Tejal Shah; Jun Song; Yang Long; Rajiv Ranjan; and Ling Shao

arXiv:2501.19084·cs.CV·February 3, 2025

TL;DR

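This paper introduces Laser, a method for efficient language-guided 3D scene segmentation in neural radiance fields. It distills dense CLIP features directly and adds an adapter module trained with a self-cross-training strategy, a low-rank transient query attention mechanism, and a label volume to improve accuracy, speed, and consistency across viewpoints.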

Contribution

Laser presents a streamlined approach to language-guided 3D segmentation by directly distilling dense CLIP features and introducing modules for noise reduction, edge accuracy, and viewpoint consistency.

Findings

01 Outperforms state-of-the-art methods in both speed and accuracy

02 Achieves precise 3D segmentation with reduced computational resources

03 Enhances segmentation consistency across viewpoints

Abstract

In this work, we propose a method that leverages CLIP feature distillation, achieving efficient 3D segmentation through language guidance. Unlike previous methods that rely on multi-scale CLIP features and are limited by processing speed and storage requirements, our approach aims to streamline the workflow by directly and effectively distilling dense CLIP features, thereby achieving precise segmentation of 3D scenes using text. To achieve this, we introduce an adapter module and mitigate the noise issue in the dense CLIP feature distillation process through a self-cross-training strategy. Moreover, to enhance the accuracy of segmentation edges, this work presents a low-rank transient query attention mechanism. To ensure the consistency of segmentation for similar colors under different viewpoints, we convert the segmentation task into a classification task through label volume, which…
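The abstract describes segmenting a scene with text by comparing dense per-pixel features against text embeddings and treating the result as a classification problem. The sketch below illustrates that general idea only; it is not the Laser implementation. The function names (`cosine`, `softmax`, `segment`), the temperature value, and the toy 3-dimensional vectors are all assumptions made for illustration. Real CLIP embeddings have hundreds of dimensions and would come from a CLIP text encoder and from features rendered by the radiance field.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + 1e-8)

def softmax(scores, temperature=0.1):
    """Numerically stable softmax; the temperature is an assumed value."""
    peak = max(scores)
    exps = [math.exp((s - peak) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def segment(pixel_features, text_embeddings, temperature=0.1):
    """Assign each pixel the index of its most probable text label."""
    labels = []
    for feat in pixel_features:
        sims = [cosine(feat, t) for t in text_embeddings]
        probs = softmax(sims, temperature)
        labels.append(max(range(len(probs)), key=probs.__getitem__))
    return labels

# Toy stand-ins for text embeddings of two hypothetical prompts.
text_embeddings = [
    [1.0, 0.0, 0.0],  # hypothetical prompt 0, e.g. "chair"
    [0.0, 1.0, 0.0],  # hypothetical prompt 1, e.g. "table"
]

# Toy stand-ins for distilled per-pixel features of three pixels.
pixel_features = [
    [0.9, 0.1, 0.0],
    [0.2, 0.8, 0.1],
    [0.7, 0.3, 0.0],
]

labels = segment(pixel_features, text_embeddings)
print(labels)
```

Each pixel receives the index of the prompt whose embedding it most resembles. Taking the argmax over a softmax is what turns the similarity map into a per-pixel classification, which is the framing the abstract attributes to the label volume.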

Code & Models

Repositories

xingy038/laser (PyTorch, official)

Taxonomy

Methods: Softmax · Attention Is All You Need · Contrastive Language-Image Pre-training · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Adapter