MV-MOS: Multi-View Feature Fusion for 3D Moving Object Segmentation

Jintao Cheng; Xingming Chen; Jinxin Liang; Xiaoyu Tang; Xieyuanli Chen; Dachuan Li

arXiv:2408.10602·cs.CV·August 21, 2024

TL;DR

This paper introduces MV-MOS, a multi-view feature fusion model that combines motion and semantic information from different 2D point cloud representations to improve 3D moving object segmentation in autonomous driving.

Contribution

The paper proposes a novel multi-view fusion approach that integrates BEV and range view features with semantic information using a Mamba module, enhancing segmentation accuracy.

Findings

1. Outperforms state-of-the-art models on the SemanticKITTI benchmark
2. Effective multi-view fusion improves segmentation accuracy
3. Semantic features enhance motion segmentation performance

Abstract

Effectively summarizing dense 3D point cloud data and extracting motion information of moving objects (moving object segmentation, MOS) is crucial to autonomous driving and robotics applications. How to effectively utilize motion and semantic features and avoid information loss during 3D-to-2D projection is still a key challenge. In this paper, we propose a novel multi-view MOS model (MV-MOS) by fusing motion-semantic features from different 2D representations of point clouds. To effectively exploit complementary information, the motion branches of the proposed model combine motion features from both bird's eye view (BEV) and range view (RV) representations. In addition, a semantic branch is introduced to provide supplementary semantic features of moving objects. Finally, a Mamba module is utilized to fuse the semantic features with motion features and provide effective guidance for…
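
To make the two 2D representations named in the abstract concrete, the sketch below projects a LiDAR point cloud to a range view (spherical projection) and to a bird's eye view (top-down grid). It is an illustrative sketch only, not the authors' implementation: the function names, image size, vertical field of view, grid extent, and cell size are all assumptions chosen to resemble a typical 64-beam sensor setup.

```python
import numpy as np


def project_range_view(points, height=64, width=2048,
                       fov_up_deg=3.0, fov_down_deg=-25.0):
    """Spherical projection of (N, 3+) points to an H x W range image.

    All default parameters are assumptions, not values from the paper.
    Empty pixels are set to -1; when several points hit the same pixel,
    the nearest range is kept.
    """
    fov_down = np.radians(fov_down_deg)
    fov = np.radians(fov_up_deg) - fov_down

    depth = np.linalg.norm(points[:, :3], axis=1)
    keep = depth > 0  # drop points at the sensor origin
    pts, depth = points[keep], depth[keep]

    yaw = np.arctan2(pts[:, 1], pts[:, 0])   # horizontal angle
    pitch = np.arcsin(pts[:, 2] / depth)     # vertical angle

    # Map angles to pixel coordinates (column u, row v).
    u = 0.5 * (1.0 - yaw / np.pi) * width
    v = (1.0 - (pitch - fov_down) / fov) * height
    u = np.clip(np.floor(u), 0, width - 1).astype(np.int64)
    v = np.clip(np.floor(v), 0, height - 1).astype(np.int64)

    image = np.full((height, width), np.inf, dtype=np.float64)
    np.minimum.at(image, (v, u), depth)      # nearest point wins per pixel
    image[np.isinf(image)] = -1.0
    return image


def project_bev(points, x_range=(-50.0, 50.0), y_range=(-50.0, 50.0),
                cell=0.5):
    """Top-down grid counting the points that fall in each cell.

    The grid extent and cell size (in metres) are assumptions.
    """
    x, y = points[:, 0], points[:, 1]
    inside = ((x >= x_range[0]) & (x < x_range[1]) &
              (y >= y_range[0]) & (y < y_range[1]))

    n_rows = int(round((y_range[1] - y_range[0]) / cell))
    n_cols = int(round((x_range[1] - x_range[0]) / cell))
    row = ((y[inside] - y_range[0]) / cell).astype(np.int64)
    col = ((x[inside] - x_range[0]) / cell).astype(np.int64)
    row = np.minimum(row, n_rows - 1)        # guard against float rounding
    col = np.minimum(col, n_cols - 1)

    grid = np.zeros((n_rows, n_cols), dtype=np.int32)
    np.add.at(grid, (row, col), 1)           # accumulate repeated cells
    return grid
```

The range view keeps only one point per pixel and the BEV grid collapses the height axis, so each projection discards different information. That complementarity is the kind of 3D-to-2D information loss the abstract refers to when motivating the use of both views.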

Code & Models

Repositories

chengjt1999/mv-mos (PyTorch, official)

Taxonomy

Topics: Video Surveillance and Tracking Methods · Advanced Image and Video Retrieval Techniques · Advanced Neural Network Applications

Methods: Mamba: Linear-Time Sequence Modeling with Selective State Spaces