GEM: Generating LiDAR World Model via Deformable Mamba

Yang Wu; Zhaojiang Liu; Qiang Meng; Youquan Liu; Renliang Weng; Jianjun Qian; Jian Yang; Jin Xie

arXiv:2605.07326 · cs.CV · May 11, 2026

TL;DR

GEM introduces a generative LiDAR world model built on a deformable Mamba architecture, improving generation fidelity and dynamic scene understanding for autonomous driving.

Contribution

The paper presents a deformable Mamba architecture tailored to LiDAR data, aimed at better spatio-temporal modeling and scene generation.

Findings

01. Achieves state-of-the-art performance on multiple benchmarks.

02. Effectively disentangles static and dynamic features in LiDAR data.

03. Demonstrates potential for autonomous planning and 'what-if' scenario generation.

Abstract

World models, which simulate environmental dynamics and generate sensor observations, are gaining increasing attention in autonomous driving. However, progress in LiDAR-based world models has lagged behind that of models built on camera videos or occupancy data, primarily due to two core challenges: the inherent disorder of LiDAR point clouds and the difficulty of distinguishing dynamic objects from static structures. To address these issues, we propose GEM: a Generative LiDAR world model that leverages a deformable Mamba architecture, significantly improving fidelity and imaginative capability. Specifically, leveraging the structural similarity between sequential laser scanning and Mamba's processing mechanism, we first tokenize LiDAR sweeps into compact representations via a custom LiDAR scene tokenizer. After unsupervised disentanglement of tokenized features via a dynamic-static separator, a…
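
The analogy the abstract draws between sequential laser scanning and sequence-model processing can be illustrated with a toy sketch. This is not the GEM tokenizer or its deformable Mamba block, neither of which is specified on this page. The function names (`scan_order_tokens`, `linear_scan`), the grid size, the field-of-view values, and the fixed scalars `a` and `b` are all hypothetical choices made for illustration. The sketch assumes a spinning LiDAR, bins points into a ring-by-azimuth grid, flattens that grid in the order a rotating sensor would sweep it, and then runs a plain linear recurrence over the resulting sequence. A real Mamba layer uses learned, input-dependent parameters rather than fixed scalars.

```python
import math


def scan_order_tokens(points, n_rings=4, n_azimuth=8, fov_up=15.0, fov_down=-15.0):
    """Bin (x, y, z) points into a ring x azimuth grid of ranges, then
    flatten the grid in scan order: one azimuth step at a time, all rings.

    Illustrative only: grid size and field of view are assumed values.
    """
    grid = [[0.0] * n_azimuth for _ in range(n_rings)]
    for x, y, z in points:
        r = math.sqrt(x * x + y * y + z * z)
        if r == 0.0:
            continue  # skip degenerate returns at the sensor origin
        azimuth = math.atan2(y, x)  # in [-pi, pi]
        sin_el = max(-1.0, min(1.0, z / r))  # clamp against rounding error
        elevation = math.degrees(math.asin(sin_el))
        col = int((azimuth + math.pi) / (2 * math.pi) * n_azimuth) % n_azimuth
        row = int((fov_up - elevation) / (fov_up - fov_down) * n_rings)
        row = min(max(row, 0), n_rings - 1)
        # Keep the nearest return when several points fall in one cell.
        if grid[row][col] == 0.0 or r < grid[row][col]:
            grid[row][col] = r
    return [grid[row][col] for col in range(n_azimuth) for row in range(n_rings)]


def linear_scan(tokens, a=0.9, b=0.1):
    """Fixed-coefficient recurrence h_t = a * h_{t-1} + b * x_t.

    A stand-in for a state space scan; a and b are hypothetical constants,
    whereas selective state space models compute them from the input.
    """
    h = 0.0
    states = []
    for x in tokens:
        h = a * h + b * x
        states.append(h)
    return states


if __name__ == "__main__":
    sweep = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (-3.0, 0.0, 0.5)]
    tokens = scan_order_tokens(sweep)
    states = linear_scan(tokens)
    print(len(tokens), len(states))
```

Flattening azimuth-first means neighbouring tokens in the sequence correspond to returns captured close together in time, which is the property that makes a left-to-right recurrence a natural fit for this kind of data.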

Code & Models

Repositories

wuyang98/GEM (GitHub)
