TL;DR
CARD is a multi-modal automotive dataset with quasi-dense 3D ground truth captured on challenging road surfaces (speed bumps, potholes, irregular and off-road segments), intended for evaluating and developing depth estimation and perception models for autonomous driving.
Contribution
The paper introduces a multi-modal dataset with quasi-dense 3D ground truth on challenging terrain, along with a standardized evaluation protocol and benchmarks for depth and perception tasks.
Findings
Multi-LiDAR fusion yields ~500K valid depth pixels per frame, about 6.5x more than KITTI Depth Completion.
The dataset covers ~110 km and 4.7 hours of driving in challenging environments.
Benchmarking of state-of-the-art depth models establishes strong baselines.
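Depth benchmarks of this kind are typically scored only on pixels that carry ground truth. The sketch below shows two common metrics (absolute relative error and RMSE) computed under a validity mask. It is illustrative only: the function name, the depth range, and the convention that missing ground truth is encoded as 0 are assumptions, not details confirmed by the paper.

```python
import numpy as np

def masked_depth_metrics(pred, gt, min_depth=1e-3, max_depth=80.0):
    """AbsRel and RMSE over pixels with valid ground truth.

    Assumption: gt == 0 marks pixels without ground truth, and the
    valid range (min_depth, max_depth) is in metres. Both are
    hypothetical conventions chosen for this sketch.
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    valid = (gt > min_depth) & (gt < max_depth)
    if not valid.any():
        raise ValueError("no valid ground-truth pixels")
    p, g = pred[valid], gt[valid]
    abs_rel = float(np.mean(np.abs(p - g) / g))   # mean |pred - gt| / gt
    rmse = float(np.sqrt(np.mean((p - g) ** 2)))  # root mean squared error
    return {"abs_rel": abs_rel, "rmse": rmse, "n_valid": int(valid.sum())}

# Toy 2x2 example: only two pixels carry ground truth.
gt = np.array([[0.0, 2.0], [4.0, 0.0]])
pred = np.array([[5.0, 1.0], [5.0, 9.0]])
metrics = masked_depth_metrics(pred, gt)
```

With denser ground truth, `n_valid` grows and the metrics cover more of the image, including thin structures and surface irregularities that sparse LiDAR tends to miss.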
Abstract
Autonomous driving must operate across diverse surfaces to enable safe mobility. However, most driving datasets are captured on well-paved flat roads. Moreover, recent driving datasets primarily provide sparse LiDAR ground truth for images, which is insufficient for assessing fine-grained geometry in depth estimation and completion. To address these gaps, we introduce CARD, a multi-modal driving dataset that delivers quasi-dense 3D ground truth across continuous sequences rich in speed bumps, potholes, irregular surfaces and off-road segments. Our sensor suite includes synchronized global-shutter stereo cameras, front and rear LiDARs, 6-DoF poses from LiDAR-inertial odometry, per-wheel motion traces, and full calibration. Notably, our multi-LiDAR fusion yields ~500K valid depth pixels per frame, about 6.5x more than KITTI Depth Completion and 10x more on average than other public…
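The "valid depth pixels per frame" figure refers to image pixels that receive a depth value once LiDAR points are projected into the camera. The following is a minimal sketch of that projection using a pinhole model and a nearest-point z-buffer. It is not the authors' fusion pipeline: the function name, the intrinsics matrix `K`, and the assumption that points are already expressed in the camera frame are all hypothetical.

```python
import numpy as np

def project_to_depth(points_cam, K, height, width):
    """Rasterize 3D points into a depth map, keeping the nearest point per pixel.

    Assumptions for this sketch: points_cam is (N, 3) in the camera
    frame (x right, y down, z forward), K is a 3x3 pinhole intrinsics
    matrix, and 0 in the output marks pixels without depth.
    """
    pts = np.asarray(points_cam, dtype=float)
    pts = pts[pts[:, 2] > 0]                 # drop points behind the camera
    z = pts[:, 2]
    uvw = (np.asarray(K, dtype=float) @ pts.T).T
    u = np.round(uvw[:, 0] / z).astype(int)  # pixel column
    v = np.round(uvw[:, 1] / z).astype(int)  # pixel row
    inside = (u >= 0) & (u < width) & (v >= 0) & (v < height)
    u, v, z = u[inside], v[inside], z[inside]
    depth = np.full((height, width), np.inf)
    np.minimum.at(depth, (v, u), z)          # z-buffer: nearest point wins
    depth[np.isinf(depth)] = 0.0             # unfilled pixels become invalid
    return depth

# Toy example on a 4x4 image with made-up intrinsics.
K = np.array([[100.0, 0.0, 2.0],
              [0.0, 100.0, 2.0],
              [0.0, 0.0, 1.0]])
points = np.array([
    [0.0, 0.0, 5.0],     # lands on pixel (row 2, col 2)
    [0.0, 0.0, 3.0],     # same pixel, closer, so it wins
    [0.0, 0.0, -1.0],    # behind the camera, dropped
    [1.0, 0.0, 1.0],     # projects outside the image, dropped
    [-0.01, 0.01, 1.0],  # lands on pixel (row 3, col 1)
])
depth = project_to_depth(points, K, height=4, width=4)
n_valid = int((depth > 0).sum())
```

Counting `depth > 0` gives the valid-pixel count for one frame. Accumulating points from several LiDARs (and, with poses, from neighbouring frames) before projection is what raises this count, at the cost of needing accurate calibration and motion compensation.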
