KITTI-360: A Novel Dataset and Benchmarks for Urban Scene Understanding in 2D and 3D
Yiyi Liao, Jun Xie, Andreas Geiger

TL;DR
KITTI-360 is a large-scale suburban driving dataset with coherent 2D and 3D semantic instance annotations, built to support autonomous driving research at the intersection of vision, graphics, and robotics.
Contribution
It introduces a large-scale dataset with 2D and 3D annotations, a tool for labeling 3D scenes with bounding primitives, a model that transfers those labels into the 2D image domain, and benchmarks for multiple perception tasks.
Findings
Over 150k images with semantic annotations
1 billion 3D points with coherent labels
Benchmarks for scene understanding, view synthesis, and SLAM
Abstract
For the last few decades, several major subfields of artificial intelligence including computer vision, graphics, and robotics have progressed largely independently from each other. Recently, however, the community has realized that progress towards robust intelligent systems such as self-driving cars requires a concerted effort across the different fields. This motivated us to develop KITTI-360, successor of the popular KITTI dataset. KITTI-360 is a suburban driving dataset which comprises richer input modalities, comprehensive semantic instance annotations and accurate localization to facilitate research at the intersection of vision, graphics and robotics. For efficient annotation, we created a tool to label 3D scenes with bounding primitives and developed a model that transfers this information into the 2D image domain, resulting in over 150k images and 1B 3D points with coherent…
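The abstract describes labeling 3D scenes with bounding primitives and transferring that information into 2D images. The sketch below only illustrates the basic geometry of that idea and is not the paper's transfer model, which the abstract does not detail. It assumes axis-aligned boxes as the primitives, points already expressed in the camera frame, and a pinhole camera with intrinsics `K`. The function names `label_points_in_box` and `project_labels` are hypothetical and are not part of any KITTI-360 tooling.

```python
import numpy as np


def label_points_in_box(points, box_min, box_max, label, labels=None):
    """Assign `label` to every 3D point inside an axis-aligned box.

    Assumption: boxes are axis-aligned; real bounding primitives may be
    oriented or have other shapes.
    """
    points = np.asarray(points, dtype=float)
    if labels is None:
        labels = np.zeros(len(points), dtype=np.int64)  # 0 means unlabeled
    inside = np.all(
        (points >= np.asarray(box_min)) & (points <= np.asarray(box_max)),
        axis=1,
    )
    labels[inside] = label
    return labels


def project_labels(points, labels, K, height, width):
    """Project labeled 3D points (camera frame) into a sparse 2D label map.

    A simple z-buffer keeps the label of the nearest point per pixel.
    """
    label_map = np.zeros((height, width), dtype=np.int64)
    depth = np.full((height, width), np.inf)
    for p, l in zip(np.asarray(points, dtype=float), labels):
        if l == 0 or p[2] <= 0:  # skip unlabeled points and points behind the camera
            continue
        u, v, w = K @ p  # homogeneous pixel coordinates
        col, row = int(round(u / w)), int(round(v / w))
        if 0 <= row < height and 0 <= col < width and p[2] < depth[row, col]:
            depth[row, col] = p[2]
            label_map[row, col] = l
    return label_map


if __name__ == "__main__":
    # Toy data: three points, one box with label 7, and a made-up camera.
    pts = np.array([[0.0, 0.0, 5.0], [1.0, 0.0, 10.0], [0.0, 0.0, 2.0]])
    lab = label_points_in_box(pts, [-0.5, -0.5, 4.0], [1.5, 0.5, 11.0], label=7)
    K = np.array([[100.0, 0.0, 32.0], [0.0, 100.0, 24.0], [0.0, 0.0, 1.0]])
    label_map = project_labels(pts, lab, K, height=48, width=64)
    print(np.argwhere(label_map == 7))
```

Projecting points this way yields only sparse 2D labels. Producing dense per-pixel annotations would need an additional inference step, which this sketch does not attempt.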
Taxonomy
Topics: Robotics and Sensor-Based Localization · Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques
