CVT-Occ: Cost Volume Temporal Fusion for 3D Occupancy Prediction

Zhangchen Ye; Tao Jiang; Chenfeng Xu; Yiming Li; Hang Zhao

arXiv:2409.13430 · cs.CV · September 26, 2024

TL;DR

CVT-Occ is a temporal fusion method that uses the geometric correspondence of voxels across historical frames to improve vision-based 3D occupancy prediction. It outperforms state-of-the-art methods on the Occ3D-Waymo dataset with minimal additional computation.

Contribution

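The paper presents CVT-Occ, a cost volume-based approach that samples points along each voxel's line of sight, integrates the features of those points from historical frames, and uses the resulting cost volume feature map to refine the current volume features.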

Findings

01 Outperforms state-of-the-art methods on the Occ3D-Waymo dataset

02 Uses parallax cues from historical observations to improve prediction accuracy

03 Adds minimal computational cost

Abstract

Vision-based 3D occupancy prediction is significantly challenged by the inherent limitations of monocular vision in depth estimation. This paper introduces CVT-Occ, a novel approach that leverages temporal fusion through the geometric correspondence of voxels over time to improve the accuracy of 3D occupancy predictions. By sampling points along the line of sight of each voxel and integrating the features of these points from historical frames, we construct a cost volume feature map that refines current volume features for improved prediction outcomes. Our method takes advantage of parallax cues from historical observations and employs a data-driven approach to learn the cost volume. We validate the effectiveness of CVT-Occ through rigorous experiments on the Occ3D-Waymo dataset, where it outperforms state-of-the-art methods in 3D occupancy prediction with minimal additional…
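
The sampling and gathering step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the nearest-voxel lookup, the fixed ray offsets, and the 4x4 pose convention are all assumptions made for illustration, and the learned refinement of the cost volume is omitted.

```python
import numpy as np

def sample_along_sight(centers, origin, offsets):
    """Sample points on the ray from `origin` through each voxel center.

    centers: (N, 3) voxel centers in the current ego frame.
    origin:  (3,) sensor position in the same frame (must differ from every center).
    offsets: (K,) signed distances along the ray, relative to the center.
    Returns an array of shape (N, K, 3).
    """
    rays = centers - origin
    dirs = rays / np.linalg.norm(rays, axis=1, keepdims=True)
    return centers[:, None, :] + offsets[None, :, None] * dirs[:, None, :]

def to_history_frame(points, cur_to_hist):
    """Apply a 4x4 rigid transform (current ego -> historical ego) to (..., 3) points."""
    ones = np.ones(points.shape[:-1] + (1,))
    homo = np.concatenate([points, ones], axis=-1)
    return (homo @ cur_to_hist.T)[..., :3]

def lookup_nearest(volume, points, grid_min, voxel_size):
    """Nearest-voxel feature lookup (an assumed stand-in for interpolation).

    volume: (X, Y, Z, C) feature volume. Points outside the grid return zeros.
    Returns an array of shape (..., C).
    """
    idx = np.floor((points - grid_min) / voxel_size).astype(int)
    shape = np.array(volume.shape[:3])
    inside = np.all((idx >= 0) & (idx < shape), axis=-1)
    idx = np.clip(idx, 0, shape - 1)
    feats = volume[idx[..., 0], idx[..., 1], idx[..., 2]]
    return feats * inside[..., None]

def build_cost_volume(centers, origin, offsets, hist_volumes, hist_poses,
                      grid_min, voxel_size):
    """Gather features of the sampled points from each historical frame.

    Returns an array of shape (N, T * K * C), one row per current voxel.
    """
    pts = sample_along_sight(centers, origin, offsets)
    per_frame = [
        lookup_nearest(vol, to_history_frame(pts, pose), grid_min, voxel_size)
        for vol, pose in zip(hist_volumes, hist_poses)
    ]
    stacked = np.stack(per_frame, axis=1)  # (N, T, K, C)
    return stacked.reshape(len(centers), -1)
```

In this sketch each current voxel ends up with one feature vector per sampled point per historical frame. A learned module, not shown here, would then turn that stack into the cost volume feature map that refines the current volume features.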

Code & Models

Repositories

Tsinghua-MARS-Lab/CVT-Occ (PyTorch, official)

Taxonomy

Topics: Medical Image Segmentation Techniques · 3D Shape Modeling and Analysis · Computer Graphics and Visualization Techniques