Skip Mamba Diffusion for Monocular 3D Semantic Scene Completion

Li Liang; Naveed Akhtar; Jordan Vice; Xiangrui Kong; Ajmal Saeed Mian

arXiv:2501.07260·cs.CV·January 14, 2025

TL;DR

This paper introduces Skip Mamba Diffusion, a diffusion-based neural model for monocular 3D semantic scene completion. It runs diffusion in the conditioned latent space of a variational autoencoder, using a state space denoiser called Skimba (Skip Mamba), and reports state-of-the-art results among monocular methods.

Contribution

It presents the Skimba (Skip Mamba) denoiser and a diffusion model built on state space modeling for efficient 3D semantic scene completion from monocular images, outperforming existing monocular methods.

Findings

01 Outperforms other monocular techniques significantly
02 Achieves competitive results with stereo methods
03 Demonstrates effectiveness on SemanticKITTI and KITTI360 datasets

Abstract

3D semantic scene completion is critical for multiple downstream tasks in autonomous systems. It estimates missing geometric and semantic information in the acquired scene data. Due to the challenging real-world conditions, this task usually demands complex models that process multi-modal data to achieve acceptable performance. We propose a unique neural model, leveraging advances from the state space and diffusion generative modeling to achieve remarkable 3D semantic scene completion performance with monocular image input. Our technique processes the data in the conditioned latent space of a variational autoencoder where diffusion modeling is carried out with an innovative state space technique. A key component of our neural network is the proposed Skimba (Skip Mamba) denoiser, which is adept at efficiently processing long-sequence data. The Skimba diffusion model is integral to our 3D…
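To make "diffusion modeling in a latent space" concrete, here is a minimal sketch of the forward noising step of a standard DDPM applied to a latent vector. It is not taken from the paper: the linear noise schedule, the step count, the function names, and the toy latent `z0` are all assumptions chosen for illustration, and the paper's Skimba denoiser (the network that would learn to reverse this process) is not modeled here.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances beta_1..beta_T (an assumed, common choice)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def noise_latent(z0, t, alpha_bars, rng):
    """Sample z_t from q(z_t | z_0) = N(sqrt(a_bar_t) * z_0, (1 - a_bar_t) * I)."""
    a_bar = alpha_bars[t]
    eps = [rng.gauss(0.0, 1.0) for _ in z0]
    zt = [math.sqrt(a_bar) * z + math.sqrt(1.0 - a_bar) * e
          for z, e in zip(z0, eps)]
    return zt, eps


rng = random.Random(0)
betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alphas(betas)

# Hypothetical flattened VAE latent; a real latent would come from an encoder.
z0 = [0.5, -1.0, 2.0, 0.0]
zt, eps = noise_latent(z0, 500, alpha_bars, rng)
```

A denoiser is typically trained to predict `eps` from `zt`, the step index, and the conditioning signal (here, features from the monocular image); sampling then runs the learned reverse process from pure noise back to a latent that the VAE decoder turns into the completed scene.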

Code & Models

Repositories

xrkong/skimba (Official)

Videos

Skip Mamba Diffusion for Monocular 3D Semantic Scene Completion · Underline

Taxonomy

Methods: ADaptive gradient method with the OPTimal convergence rate · Diffusion · Mamba: Linear-Time Sequence Modeling with Selective State Spaces