OCC-VO: Dense Mapping via 3D Occupancy-Based Visual Odometry for Autonomous Driving
Heng Li, Yifan Duan, Xinran Zhang, Haiyi Liu, Jianmin Ji, Yanyong Zhang

TL;DR
OCC-VO introduces a deep learning-based framework that converts 2D camera images into 3D semantic occupancy, enabling visual odometry and mapping for autonomous driving without the traditional joint estimation of ego poses and landmark locations.
Contribution
The paper presents OCC-VO, a novel 3D occupancy-based visual odometry framework that leverages deep learning and semantic filtering to improve autonomous driving mapping accuracy.
Findings
20.6% improvement in Success Ratio on Occ3D-nuScenes
29.6% better trajectory accuracy
Effective 3D semantic occupancy mapping
Abstract
Visual Odometry (VO) plays a pivotal role in autonomous systems, with a principal challenge being the lack of depth information in camera images. This paper introduces OCC-VO, a novel framework that capitalizes on recent advances in deep learning to transform 2D camera images into 3D semantic occupancy, thereby circumventing the traditional need for concurrent estimation of ego poses and landmark locations. Within this framework, we utilize the TPV-Former to convert surround view cameras' images into 3D semantic occupancy. Addressing the challenges presented by this transformation, we have specifically tailored a pose estimation and mapping algorithm that incorporates Semantic Label Filter, Dynamic Object Filter, and finally, utilizes Voxel PFilter for maintaining a consistent global semantic map. Evaluations on the Occ3D-nuScenes not only showcase a 20.6% improvement in Success Ratio…
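The abstract names a Semantic Label Filter but does not define it here. As a rough illustration only, the sketch below assumes such a filter drops free voxels and a caller-chosen set of semantic classes before the remaining occupied voxels are handed to pose estimation as a point cloud. The function name `occupancy_to_points`, the label IDs, the voxel size, and the grid origin are all hypothetical and are not taken from the paper.

```python
import numpy as np

def occupancy_to_points(labels, voxel_size, origin, free_label, drop_labels=()):
    """Turn a dense (X, Y, Z) grid of integer semantic labels into voxel-centre points.

    Assumption: voxels equal to `free_label` are empty, and voxels whose label
    is in `drop_labels` are discarded (a stand-in for a semantic label filter).
    """
    labels = np.asarray(labels)
    keep = labels != free_label                       # occupied voxels only
    if len(drop_labels) > 0:
        keep &= ~np.isin(labels, list(drop_labels))   # remove unwanted classes
    idx = np.argwhere(keep)                           # (N, 3) voxel indices, row-major order
    points = (idx + 0.5) * voxel_size + np.asarray(origin, dtype=float)
    return points, labels[keep]                       # labels come back in the same order

# Tiny usage example with made-up label IDs: 0 = free, 3 = a class we choose to drop.
grid = np.zeros((2, 2, 2), dtype=int)
grid[0, 0, 0] = 1
grid[0, 1, 0] = 3
grid[1, 1, 1] = 2
points, kept = occupancy_to_points(grid, voxel_size=0.5, origin=(0.0, 0.0, 0.0),
                                   free_label=0, drop_labels={3})
print(points)   # two voxel centres remain
print(kept)     # their semantic labels
```

The resulting points and labels could then feed a registration step between consecutive frames; how OCC-VO actually performs that step, and how the Dynamic Object Filter and Voxel PFilter operate, is not specified in the text above.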
Taxonomy
Topics: Robotics and Sensor-Based Localization · Advanced Vision and Imaging · Advanced Image and Video Retrieval Techniques
