Perspective-aware Convolution for Monocular 3D Object Detection

Jia-Quan Yu; Soo-Chang Pei

arXiv:2308.12938·cs.CV·August 25, 2023·1 cites

Open Access · 1 Repo

TL;DR

This paper introduces a perspective-aware convolutional layer that captures depth-related features in images, improving monocular 3D object detection accuracy for autonomous driving.

Contribution

It proposes a novel convolutional layer that encodes perspective information, enhancing feature extraction for monocular 3D detection tasks.

Findings

01 Achieved 23.9% average precision on KITTI3D easy benchmark

02 Improved depth inference by modeling scene perspective

03 Enhanced 3D detection accuracy with the new convolutional layer

Abstract

Monocular 3D object detection is a crucial and challenging task for autonomous driving vehicles, since it uses only a single camera image to infer 3D objects in the scene. To address the difficulty of predicting depth from pictorial cues alone, we propose a novel perspective-aware convolutional layer that captures long-range dependencies in images. By enforcing convolutional kernels to extract features along the depth axis of every image pixel, we incorporate perspective information into the network architecture. We integrate our perspective-aware convolutional layer into a 3D object detector and demonstrate improved performance on the KITTI3D dataset, achieving 23.9% average precision on the easy benchmark. These results underscore the importance of modeling scene cues for accurate depth inference and highlight the benefits of incorporating scene structure in network design. Our…
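
The idea of sampling features along each pixel's depth axis can be illustrated with a small sketch. This is not the authors' implementation. It assumes, purely for illustration, that the image-plane projection of the depth axis at every pixel points toward a single vanishing point, and it applies a 1D kernel along that direction with bilinear sampling. The names `bilinear`, `depth_axis_conv`, `vanishing_point`, and `step` are hypothetical and do not come from the paper or its repository.

```python
import math


def bilinear(feat, x, y):
    """Bilinearly sample a 2D feature map (list of rows) at (x, y).

    Locations outside the map contribute zero.
    """
    h, w = len(feat), len(feat[0])
    x0, y0 = math.floor(x), math.floor(y)
    fx, fy = x - x0, y - y0
    total = 0.0
    for yi, wy in ((y0, 1.0 - fy), (y0 + 1, fy)):
        for xi, wx in ((x0, 1.0 - fx), (x0 + 1, fx)):
            if 0 <= xi < w and 0 <= yi < h:
                total += wx * wy * feat[yi][xi]
    return total


def depth_axis_conv(feat, weights, vanishing_point, step=1.0):
    """Apply a 1D kernel along the line from each pixel toward a vanishing point.

    ASSUMPTION: the projected depth axis at every pixel points at one shared
    vanishing point. The paper may derive the direction differently.
    """
    h, w = len(feat), len(feat[0])
    vx, vy = vanishing_point
    half = len(weights) // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx, dy = vx - x, vy - y
            norm = math.hypot(dx, dy)
            # At the vanishing point itself the direction is undefined;
            # all taps then fall on the pixel itself.
            ux, uy = (dx / norm, dy / norm) if norm > 0 else (0.0, 0.0)
            acc = 0.0
            for i, wgt in enumerate(weights):
                t = (i - half) * step  # signed offset along the depth direction
                acc += wgt * bilinear(feat, x + t * ux, y + t * uy)
            out[y][x] = acc
    return out


if __name__ == "__main__":
    row = [[0, 1, 2, 3, 4]]
    print(depth_axis_conv(row, [1, 0, 1], vanishing_point=(4, 0)))
```

Unlike a square kernel, the sampling pattern here differs per pixel, which is one way a layer could aggregate context along the direction in which depth changes. A practical layer would run on batched tensors with learned weights rather than Python lists.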

Code & Models

Repositories

KenYu910645/perspective-aware-convolution
PyTorch · Official

Taxonomy

Topics: Advanced Neural Network Applications · Visual Attention and Saliency Detection · Video Surveillance and Tracking Methods