VPOcc: Exploiting Vanishing Point for 3D Semantic Occupancy Prediction
Junsu Kim, Junhee Lee, Ukcheol Shin, Jean Oh, Kyungdon Joo

TL;DR
VPOcc introduces a novel framework that leverages vanishing point information to improve 3D semantic occupancy prediction from 2D images, addressing perspective-induced scale discrepancies for better scene understanding.
Contribution
The paper proposes VPOcc, a framework that uses vanishing point-based modules for perspective-aware image warping and feature aggregation, enhancing 3D scene prediction accuracy.
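For readers unfamiliar with the term, a vanishing point is the image location where parallel 3D lines along a given direction appear to converge. The sketch below is only an illustration under an ideal pinhole camera model, not the paper's method for obtaining the VP: the function name `vanishing_point` and the intrinsic values are hypothetical, and the direction is assumed to be expressed in camera coordinates.

```python
from typing import Optional, Sequence, Tuple


def vanishing_point(
    direction: Sequence[float],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> Optional[Tuple[float, float]]:
    """Project a 3D direction (camera coordinates) to its vanishing point.

    Under a pinhole model with intrinsics K, the vanishing point of a
    direction d is the dehomogenized product K @ d.
    Returns None when the direction is parallel to the image plane,
    in which case the vanishing point lies at infinity.
    """
    dx, dy, dz = direction
    if abs(dz) < 1e-12:
        return None
    return (fx * dx / dz + cx, fy * dy / dz + cy)


# Hypothetical intrinsics, chosen only for illustration.
FX, FY, CX, CY = 700.0, 700.0, 600.0, 180.0

# A camera looking straight down the road: the VP is the principal point.
vp_forward = vanishing_point((0.0, 0.0, 1.0), FX, FY, CX, CY)

# A road direction slightly to the right shifts the VP to the right.
vp_right = vanishing_point((0.1, 0.0, 1.0), FX, FY, CX, CY)
```

When the driving direction coincides with the optical axis, the vanishing point falls on the principal point `(cx, cy)`; any yaw or pitch of the road relative to the camera moves it away from there.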
Findings
Achieved higher IoU and mIoU than baseline methods on the SemanticKITTI and SSCBench-KITTI360 datasets.
Demonstrated that the vanishing point-guided modules mitigate perspective distortion.
Abstract
Understanding 3D scenes semantically and spatially is crucial for the safe navigation of robots and autonomous vehicles, aiding obstacle avoidance and accurate trajectory planning. Camera-based 3D semantic occupancy prediction, which infers complete voxel grids from 2D images, is gaining importance in robot vision for its resource efficiency compared to 3D sensors. However, this task inherently suffers from a 2D-3D discrepancy, where objects of the same size in 3D space appear at different scales in a 2D image depending on their distance from the camera due to perspective projection. To tackle this issue, we propose a novel framework called VPOcc that leverages a vanishing point (VP) to mitigate the 2D-3D discrepancy at both the pixel and feature levels. As a pixel-level solution, we introduce a VPZoomer module, which warps images by counteracting the perspective effect using a VP-based…
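The 2D-3D scale discrepancy described in the abstract follows directly from the pinhole projection model, where the image-plane size of an object is inversely proportional to its depth. The short sketch below illustrates this with assumed numbers; the focal length, object height, and function name `projected_height_px` are hypothetical and do not come from the paper.

```python
def projected_height_px(object_height_m: float, depth_m: float, focal_px: float) -> float:
    """Image-plane height in pixels of an object under an ideal pinhole camera.

    Uses the similar-triangles relation h_image = f * H / Z.
    """
    if depth_m <= 0:
        raise ValueError("depth must be positive")
    return focal_px * object_height_m / depth_m


# Hypothetical values, chosen only for illustration.
FOCAL_PX = 700.0
CAR_HEIGHT_M = 1.5

near = projected_height_px(CAR_HEIGHT_M, 5.0, FOCAL_PX)   # car 5 m away
far = projected_height_px(CAR_HEIGHT_M, 50.0, FOCAL_PX)   # same car 50 m away

print(f"near: {near:.1f} px, far: {far:.1f} px, ratio: {near / far:.1f}x")
```

Two identical cars thus occupy the same number of voxels in 3D but differ tenfold in pixel height in this example, which is the mismatch that VP-guided warping and feature aggregation aim to reduce.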
Taxonomy
Topics: Human Pose and Action Recognition · AI in cancer detection · Video Analysis and Summarization
