KITTI-360: A Novel Dataset and Benchmarks for Urban Scene Understanding in 2D and 3D

Yiyi Liao; Jun Xie; Andreas Geiger

arXiv:2109.13410 · cs.CV · June 6, 2022 · 29 cites

TL;DR

KITTI-360 is a comprehensive urban scene dataset with rich multi-modal annotations designed to advance research in autonomous driving across vision, graphics, and robotics.

Contribution

It introduces a new large-scale dataset with 2D and 3D annotations, a novel annotation transfer tool, and benchmarks for multiple perception tasks.

Findings

01

Over 150k images with semantic annotations

02

1 billion 3D points with coherent labels

03

Benchmarks for scene understanding, view synthesis, and SLAM

Abstract

For the last few decades, several major subfields of artificial intelligence including computer vision, graphics, and robotics have progressed largely independently from each other. Recently, however, the community has realized that progress towards robust intelligent systems such as self-driving cars requires a concerted effort across the different fields. This motivated us to develop KITTI-360, successor of the popular KITTI dataset. KITTI-360 is a suburban driving dataset which comprises richer input modalities, comprehensive semantic instance annotations and accurate localization to facilitate research at the intersection of vision, graphics and robotics. For efficient annotation, we created a tool to label 3D scenes with bounding primitives and developed a model that transfers this information into the 2D image domain, resulting in over 150k images and 1B 3D points with coherent…

Taxonomy

Topics: Robotics and Sensor-Based Localization · Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques