The H3D Dataset for Full-Surround 3D Multi-Object Detection and Tracking in Crowded Urban Scenes
Abhishek Patil, Srikanth Malla, Haiming Gang, Yi-Ting Chen

TL;DR
The paper introduces H3D, a large-scale, fully annotated 3D LiDAR dataset of crowded urban scenes designed to advance research in 3D multi-object detection and tracking, addressing the limitations of existing datasets.
Contribution
It provides a new comprehensive dataset with rich annotations for full-surround 3D detection and tracking, along with a novel annotation methodology and a standardized benchmark for evaluation.
Findings
H3D contains 1 million labeled instances across 27,721 frames, drawn from 160 crowded, highly interactive traffic scenes.
Benchmark results highlight current algorithm performance and error sources.
The dataset facilitates research on complex, crowded urban traffic scenes.
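To give a feel for how crowded the scenes are, the headline counts can be turned into rough averages. The sketch below uses only the figures quoted on this page (1 million instances, 27,721 frames, and the 160 scenes mentioned in the abstract). The instance count is a rounded figure, so the results are approximate, and the function name `dataset_density` is a hypothetical helper for illustration, not part of any H3D tooling.

```python
# Figures quoted in this summary; "1 million" is a rounded count,
# so every derived average below is approximate.
LABELED_INSTANCES = 1_000_000
FRAMES = 27_721
SCENES = 160


def dataset_density(instances: int, frames: int, scenes: int) -> dict:
    """Return rough per-frame and per-scene averages (hypothetical helper)."""
    return {
        "instances_per_frame": instances / frames,
        "frames_per_scene": frames / scenes,
        "instances_per_scene": instances / scenes,
    }


stats = dataset_density(LABELED_INSTANCES, FRAMES, SCENES)
for name, value in stats.items():
    print(f"{name}: {value:.1f}")
```

Under these assumptions, an average frame carries about 36 labeled objects and an average scene spans about 173 frames.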
Abstract
3D multi-object detection and tracking are crucial for traffic scene understanding. However, the community pays less attention to these areas due to the lack of a standardized benchmark dataset to advance the field. Moreover, existing datasets (e.g., KITTI) do not provide sufficient data and labels to tackle challenging scenes where highly interactive and occluded traffic participants are present. To address these issues, we present the Honda Research Institute 3D Dataset (H3D), a large-scale full-surround 3D multi-object detection and tracking dataset collected using a 3D LiDAR scanner. H3D comprises 160 crowded and highly interactive traffic scenes with a total of 1 million labeled instances in 27,721 frames. With unique dataset size, rich annotations, and complex scenes, H3D is gathered to stimulate research on full-surround 3D multi-object detection and tracking. To effectively and…
Taxonomy
Topics: Advanced Neural Network Applications · Remote Sensing and LiDAR Applications · Autonomous Vehicle Technology and Safety
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
