Towards Robust Robot 3D Perception in Urban Environments: The UT Campus Object Dataset
Arthur Zhang, Chaitanya Eranki, Christina Zhang, Ji-Hwan Park, Raymond Hong, Pranav Kalyani, Lochana Kalyanaraman, Arsh Gamare, Arnav Bagad, Maria Esteva, Joydeep Biswas

TL;DR
The UT Campus Object Dataset (CODa) is a multimodal dataset for urban robot perception. Its extensive annotations and diverse environmental conditions support improved 3D object detection and semantic segmentation.
Contribution
This paper introduces CODa, a new large-scale egocentric perception dataset with multimodal sensor data, extensive annotations, and benchmarks for urban robot perception tasks.
Findings
Training on CODa improves 3D object detection performance in urban settings.
Sensor-specific fine-tuning improves 3D object detection accuracy.
Pretraining on CODa outperforms pretraining on autonomous vehicle (AV) datasets for cross-dataset 3D object detection.
Abstract
We introduce the UT Campus Object Dataset (CODa), a mobile robot egocentric perception dataset collected on the University of Texas Austin Campus. Our dataset contains 8.5 hours of multimodal sensor data: synchronized 3D point clouds and stereo RGB video from a 128-channel 3D LiDAR and two 1.25MP RGB cameras at 10 fps; RGB-D videos from an additional 0.5MP sensor at 7 fps, and a 9-DOF IMU sensor at 40 Hz. We provide 58 minutes of ground-truth annotations containing 1.3 million 3D bounding boxes with instance IDs for 53 semantic classes, 5000 frames of 3D semantic annotations for urban terrain, and pseudo-ground truth localization. We repeatedly traverse identical geographic locations for a wide range of indoor and outdoor areas, weather conditions, and times of the day. Using CODa, we empirically demonstrate that: 1) 3D object detection performance in urban settings is significantly…
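To give a rough sense of scale, the figures quoted in the abstract can be turned into approximate frame counts. The sketch below is a back-of-the-envelope estimate only: it assumes the annotated frames are sampled at the same 10 fps as the synchronized LiDAR and stereo streams, which the abstract does not state, and all variable names are illustrative rather than part of any CODa tooling.

```python
# Back-of-the-envelope scale estimate from the numbers quoted in the abstract.
# ASSUMPTION (not stated in the abstract): annotated frames are sampled at the
# same 10 fps rate as the synchronized LiDAR / stereo RGB streams.

TOTAL_HOURS = 8.5          # total multimodal sensor data
ANNOTATED_MINUTES = 58     # ground-truth annotated portion
SYNC_FPS = 10              # synchronized LiDAR + stereo RGB rate
NUM_BOXES = 1_300_000      # 3D bounding boxes, as quoted


def frame_count(duration_seconds: float, fps: float) -> int:
    """Number of frames captured over a duration at a fixed frame rate."""
    return round(duration_seconds * fps)


total_frames = frame_count(TOTAL_HOURS * 3600, SYNC_FPS)
annotated_frames = frame_count(ANNOTATED_MINUTES * 60, SYNC_FPS)

# Average boxes per annotated frame and share of the data that is annotated,
# both valid only under the frame-rate assumption above.
boxes_per_frame = NUM_BOXES / annotated_frames
annotated_fraction = annotated_frames / total_frames

print(f"total synchronized frames: {total_frames}")
print(f"annotated frames (assumed 10 fps): {annotated_frames}")
print(f"average 3D boxes per annotated frame: {boxes_per_frame:.1f}")
print(f"annotated share of all frames: {annotated_fraction:.1%}")
```

Under that assumption, the annotated portion works out to about 34,800 frames, or roughly 11% of the recorded data, with an average in the high thirties of boxes per frame. If the true annotation rate differs, these estimates scale accordingly.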
