B4DL: A Benchmark for 4D LiDAR LLM in Spatio-Temporal Understanding

Changho Choi; Youngwoo Shin; Gyojin Han; Dong-Jae Lee; Junmo Kim

arXiv:2508.05269·cs.CV·May 13, 2026

TL;DR

B4DL introduces a benchmark and a multimodal large language model (MLLM) for understanding 4D LiDAR data in dynamic outdoor environments, with a focus on spatio-temporal reasoning.

Contribution

It contributes a benchmark with an accompanying dataset, a scalable data generation pipeline, and an MLLM that, according to the authors, is the first to directly process raw 4D LiDAR data.

Findings

01. The dataset includes rendered 4D LiDAR videos and annotations.

02. The proposed model effectively bridges 4D LiDAR with language understanding.

03. Benchmark results demonstrate improved spatio-temporal reasoning capabilities.

Abstract

Understanding dynamic outdoor environments requires capturing complex object interactions and their evolution over time. LiDAR-based 4D point clouds provide precise spatial geometry and rich temporal cues, making them ideal for representing real-world scenes. However, despite their potential, 4D LiDAR remains underexplored in the context of Multimodal Large Language Models (MLLMs) due to the absence of high-quality, modality-specific annotations and the lack of MLLM architectures capable of processing its high-dimensional composition. To address these challenges, we introduce B4DL, a new benchmark specifically designed for training and evaluating MLLMs on 4D LiDAR understanding. In addition, we propose a scalable data generation pipeline and an MLLM model that, for the first time, directly processes raw 4D LiDAR by bridging it with language understanding. Combined with our dataset and…

Code & Models

Repositories

ccho4702/B4DL (GitHub)
