123D: Unifying Multi-Modal Autonomous Driving Data at Scale

Daniel Dauner; Valentin Charraut; Bastian Berle; Tianyu Li; Long Nguyen; Jiabao Wang; Changhui Jing; Maximilian Igl; Holger Caesar; Boris Ivanovic; Yiyi Liao; Andreas Geiger; Kashyap Chitta

arXiv:2605.08084 · cs.RO · May 11, 2026

TL;DR

123D is an open-source framework that unifies diverse multi-modal autonomous driving datasets into a single API, enabling cross-dataset analysis, transfer learning, and reinforcement learning applications.

Contribution

The paper introduces 123D, a unified platform that consolidates multiple autonomous driving datasets behind a single API, making analysis, comparison, and application development across datasets easier.

Findings

01. Consolidated 8 real-world datasets totaling 3,300 hours and 90,000 km.

02. Enabled cross-dataset 3D object detection transfer.

03. Demonstrated reinforcement learning for planning using unified data.

Abstract

The pursuit of autonomous driving has produced one of the richest sensor data collections in all of robotics. However, its scale and diversity remain largely untapped. Each dataset adopts different 2D and 3D modalities, such as cameras, lidar, ego states, annotations, traffic lights, and HD maps, with different rates and synchronization schemes. They come in fragmented formats requiring complex dependencies that cannot natively coexist in the same development environment. Further, major inconsistencies in annotation conventions prevent training or measuring generalization across multiple datasets. We present 123D, an open-source framework that unifies such multi-modal driving data through a single API. To handle synchronization, we store each modality as an independent timestamped event stream with no prescribed rate, enabling synchronous or asynchronous access across arbitrary…
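The storage idea in the abstract, one independent timestamped event stream per modality with no prescribed rate, can be illustrated with a small sketch. This is not the py123d API: `EventStream`, `nearest`, `synchronized_view`, the microsecond timestamps, and the tolerance parameter are all hypothetical names and choices made for illustration. The sketch assumes that synchronous access can be built by picking one stream as the reference clock and looking up the nearest event in every other stream.

```python
from bisect import bisect_left
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Tuple


@dataclass
class EventStream:
    """Hypothetical container: one modality as a time-sorted list of events."""

    name: str
    timestamps_us: List[int] = field(default_factory=list)
    payloads: List[Any] = field(default_factory=list)

    def append(self, timestamp_us: int, payload: Any) -> None:
        # Assumption: events are appended in non-decreasing time order,
        # which keeps the timestamp list sorted for binary search.
        if self.timestamps_us and timestamp_us < self.timestamps_us[-1]:
            raise ValueError("events must be appended in time order")
        self.timestamps_us.append(timestamp_us)
        self.payloads.append(payload)

    def nearest(
        self, query_us: int, tolerance_us: Optional[int] = None
    ) -> Optional[Tuple[int, Any]]:
        """Return the (timestamp, payload) closest to query_us, or None."""
        if not self.timestamps_us:
            return None
        i = bisect_left(self.timestamps_us, query_us)
        # Only the neighbors on either side of the insertion point can be closest.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(self.timestamps_us)]
        best = min(candidates, key=lambda j: abs(self.timestamps_us[j] - query_us))
        if tolerance_us is not None and abs(self.timestamps_us[best] - query_us) > tolerance_us:
            return None
        return self.timestamps_us[best], self.payloads[best]


def synchronized_view(
    reference: EventStream, others: List[EventStream], tolerance_us: int
) -> List[Dict[str, Optional[Tuple[int, Any]]]]:
    """Build one frame per reference event, matching each other stream by time."""
    frames = []
    for t, payload in zip(reference.timestamps_us, reference.payloads):
        frame: Dict[str, Optional[Tuple[int, Any]]] = {reference.name: (t, payload)}
        for stream in others:
            # A stream with no event within the tolerance contributes None.
            frame[stream.name] = stream.nearest(t, tolerance_us)
        frames.append(frame)
    return frames


# Illustrative data only: a lidar stream near 10 Hz and a camera stream near 30 Hz.
lidar = EventStream("lidar")
for k in range(3):
    lidar.append(k * 100_000, f"sweep_{k}")

camera = EventStream("camera")
for k in range(7):
    camera.append(k * 33_000, f"image_{k}")

frames = synchronized_view(lidar, [camera], tolerance_us=5_000)
```

Asynchronous access in this sketch is simply iterating a single stream at its native rate, while `synchronized_view` resamples the other streams onto the reference timestamps. Keeping the streams independent means no modality has to be dropped or duplicated at storage time to fit a fixed frame rate.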

Code & Models

Repositories

kesai-labs/py123d (GitHub)
