Towards Robust Robot 3D Perception in Urban Environments: The UT Campus Object Dataset

Arthur Zhang; Chaitanya Eranki; Christina Zhang; Ji-Hwan Park; Raymond Hong; Pranav Kalyani; Lochana Kalyanaraman; Arsh Gamare; Arnav Bagad; Maria Esteva; Joydeep Biswas

arXiv:2309.13549 · cs.RO · October 3, 2023

TL;DR

The UT Campus Object Dataset (CODa) provides a comprehensive multimodal dataset for urban robot perception, enabling improved 3D object detection and semantic segmentation through extensive annotations and diverse environmental conditions.

Contribution

This paper introduces CODa, a new large-scale egocentric perception dataset with multimodal sensor data, extensive annotations, and benchmarks for urban robot perception tasks.

Findings

1. Training on CODa enhances 3D detection performance in urban settings.
2. Sensor-specific fine-tuning improves detection accuracy.
3. Pretraining on CODa outperforms AV datasets for cross-dataset detection.

Abstract

We introduce the UT Campus Object Dataset (CODa), a mobile robot egocentric perception dataset collected on the University of Texas at Austin campus. Our dataset contains 8.5 hours of multimodal sensor data: synchronized 3D point clouds and stereo RGB video from a 128-channel 3D LiDAR and two 1.25MP RGB cameras at 10 fps; RGB-D videos from an additional 0.5MP sensor at 7 fps; and measurements from a 9-DOF IMU sensor at 40 Hz. We provide 58 minutes of ground-truth annotations containing 1.3 million 3D bounding boxes with instance IDs for 53 semantic classes, 5000 frames of 3D semantic annotations for urban terrain, and pseudo-ground truth localization. We repeatedly traverse identical geographic locations for a wide range of indoor and outdoor areas, weather conditions, and times of day. Using CODa, we empirically demonstrate that: 1) 3D object detection performance in urban settings is significantly…
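To give a feel for the scale implied by the figures in the abstract, the following is a rough back-of-envelope sketch, not a statement of the dataset's actual frame counts. It assumes that every sensor runs at its nominal rate for the full 8.5 hours and that the 58 annotated minutes are labeled at the 10 fps LiDAR rate; neither assumption is confirmed by the text above. The names `sample_count`, `RATES_HZ`, and the other variables are illustrative only.

```python
# Back-of-envelope sample counts from the figures quoted in the abstract.
# Assumptions (not stated in the source): every sensor runs at its nominal
# rate for the whole recording, and annotations use the 10 fps LiDAR rate.

TOTAL_HOURS = 8.5
ANNOTATED_MINUTES = 58
NUM_BOXES = 1_300_000
RATES_HZ = {"lidar_and_stereo": 10, "rgbd": 7, "imu": 40}


def sample_count(duration_s: float, rate_hz: float) -> int:
    """Number of samples produced at a fixed rate over a duration."""
    return round(duration_s * rate_hz)


total_s = TOTAL_HOURS * 3600
annotated_s = ANNOTATED_MINUTES * 60

# Nominal number of samples per sensor stream over the full recording.
totals = {name: sample_count(total_s, hz) for name, hz in RATES_HZ.items()}

# Annotated frames and average box density, under the 10 fps assumption.
annotated_frames = sample_count(annotated_s, RATES_HZ["lidar_and_stereo"])
boxes_per_frame = NUM_BOXES / annotated_frames
annotated_fraction = annotated_s / total_s

print(totals)
print(annotated_frames, round(boxes_per_frame, 1), round(annotated_fraction, 3))
```

Under these assumptions the recording corresponds to roughly 306,000 LiDAR and stereo frames, of which about 34,800 (around 11% of the recording time) carry bounding-box annotations, averaging about 37 boxes per annotated frame. The real counts may differ if sensors drop frames or annotations are sampled at another rate.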
