Diversity Matters: Fully Exploiting Depth Clues for Reliable Monocular 3D Object Detection

Zhuoling Li; Zhan Qu; Yang Zhou; Jianzhuang Liu; Haoqian Wang; Lihui Jiang

arXiv:2205.09373·cs.CV·May 20, 2022·1 cites

Open Access

TL;DR

This paper introduces a novel depth solving system for monocular 3D object detection that leverages multiple depth estimations from diverse assumptions, improving robustness and accuracy without extra data.

Contribution

It proposes a depth estimation approach that generates multiple hypotheses from different assumptions and adaptively combines them for more reliable monocular 3D detection.

Findings

01. Surpasses the current best method by over 20% (relative) on the KITTI benchmark

02. Achieves higher robustness by exploiting diverse depth clues

03. Maintains real-time efficiency

Abstract

As an inherently ill-posed problem, depth estimation from single images is the most challenging part of monocular 3D object detection (M3OD). Many existing methods rely on preconceived assumptions to bridge the missing spatial information in monocular images, and predict a sole depth value for every object of interest. However, these assumptions do not always hold in practical applications. To tackle this problem, we propose a depth solving system that fully explores the visual clues from the subtasks in M3OD and generates multiple estimations for the depth of each target. Since the depth estimations rely on different assumptions in essence, they present diverse distributions. Even if some assumptions collapse, the estimations established on the remaining assumptions are still reliable. In addition, we develop a depth selection and combination strategy. This strategy is able to remove…
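
To make the idea of selecting and combining several depth estimations concrete, here is a minimal sketch. It is not the authors' strategy, whose details are cut off above. It assumes each hypothesis comes with a predicted uncertainty, drops hypotheses that lie far from the median, and averages the rest with inverse-variance weights. The function name `combine_depths`, the `sigmas` input, and the `max_dev` threshold are all hypothetical.

```python
from statistics import median


def combine_depths(depths, sigmas, max_dev=2.0):
    """Illustrative only: merge several depth hypotheses for one object.

    depths  -- depth estimates in metres, one per assumption (hypothetical input)
    sigmas  -- assumed per-estimate uncertainties in metres (hypothetical input)
    max_dev -- assumed rejection threshold, in units of each estimate's sigma
    """
    center = median(depths)

    # Selection: keep only hypotheses close to the median of all hypotheses.
    kept = [(d, s) for d, s in zip(depths, sigmas)
            if abs(d - center) <= max_dev * s]
    if not kept:
        # Fall back to all hypotheses if every one was rejected.
        kept = list(zip(depths, sigmas))

    # Combination: inverse-variance weighted average of the survivors.
    weights = [1.0 / (s * s) for _, s in kept]
    weighted_sum = sum(w * d for w, (d, _) in zip(weights, kept))
    return weighted_sum / sum(weights)


# Three hypotheses agree near 20 m; the fourth relies on an assumption that failed.
estimate = combine_depths([20.0, 20.4, 19.8, 35.0], [0.5, 0.5, 0.5, 0.5])
```

In this sketch the 35 m hypothesis is discarded because it deviates from the median by far more than its assumed uncertainty allows, so a single collapsed assumption does not corrupt the final depth.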

Taxonomy

Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Visual Attention and Saliency Detection