FCOS3D: Fully Convolutional One-Stage Monocular 3D Object Detection

Tai Wang; Xinge Zhu; Jiangmiao Pang; Dahua Lin

arXiv:2104.10956 · cs.CV · September 27, 2021 · 27 cites

TL;DR

FCOS3D is a fully convolutional, single-stage framework for monocular 3D object detection. It builds on advances in 2D detection to cope with the missing depth information in single images, and it took 1st place among vision-only methods in the nuScenes 3D detection challenge.

Contribution

This work presents a general framework that projects the 7-DoF 3D targets into the image domain and decouples them into 2D and 3D attributes, so that a 2D-style detector can be trained without 2D detection or 2D-3D correspondence priors.

Findings

01

Achieved 1st place among vision-only methods in the nuScenes 3D detection challenge (NeurIPS 2020).

02

Effectively decouples 3D targets into 2D and 3D attributes.

03

Redefines center-ness using a 2D Gaussian based on 3D centers.
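
The Gaussian center-ness in finding 03 can be illustrated with a minimal sketch. It assumes the target has the form exp(-alpha * (dx^2 + dy^2)), where (dx, dy) is the offset from a feature location to the projected 3D center. The function name, the normalization of offsets by the feature stride, and the default `alpha=2.5` are assumptions made for this example rather than details stated on this page.

```python
import math


def gaussian_centerness(px, py, cx, cy, stride, alpha=2.5):
    """Center-ness target for one feature location (hypothetical helper).

    (px, py): image-plane position of the feature location, in pixels.
    (cx, cy): projected 3D center of the object, in pixels.
    stride:   stride of the feature level; used here to normalize offsets
              (an assumption of this sketch).
    alpha:    decay rate of the Gaussian (assumed value).
    """
    dx = (px - cx) / stride
    dy = (py - cy) / stride
    # Isotropic 2D Gaussian: 1.0 at the projected center, decaying with distance.
    return math.exp(-alpha * (dx * dx + dy * dy))


# A location on the projected center gets the maximum target.
at_center = gaussian_centerness(100.0, 100.0, 100.0, 100.0, stride=8.0)
# A location one stride away gets a much smaller target.
one_stride_away = gaussian_centerness(108.0, 100.0, 100.0, 100.0, stride=8.0)
```

Because the target depends only on the distance to the projected 3D center, it stays well defined even when that center does not coincide with the center of the object's 2D bounding box.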

Abstract

Monocular 3D object detection is an important task for autonomous driving considering its advantage of low cost. It is much more challenging than conventional 2D cases due to its inherent ill-posed property, which is mainly reflected in the lack of depth information. Recent progress on 2D detection offers opportunities to better solve this problem. However, it is non-trivial to make a general adapted 2D detector work in this 3D task. In this paper, we study this problem with a practice built on a fully convolutional single-stage detector and propose a general framework FCOS3D. Specifically, we first transform the commonly defined 7-DoF 3D targets to the image domain and decouple them as 2D and 3D attributes. Then the objects are distributed to different feature levels with consideration of their 2D scales and assigned only according to the projected 3D-center for the training…
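
The step of transforming a 7-DoF target to the image domain and decoupling it can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it assumes a pinhole camera with intrinsics `fx, fy, cx, cy`, a box given in camera coordinates as `(x, y, z, w, l, h, yaw)`, and a split into an image-plane offset (2D attributes) plus depth, size, and yaw (3D attributes). The function names and the dictionary layout are hypothetical.

```python
def project_center(center_cam, fx, fy, cx, cy):
    """Project a 3D point in camera coordinates to pixels with a pinhole model."""
    x, y, z = center_cam
    if z <= 0:
        raise ValueError("point must lie in front of the camera (z > 0)")
    u = fx * x / z + cx
    v = fy * y / z + cy
    return u, v, z


def decouple_targets(box7, intrinsics, location):
    """Split a 7-DoF box into 2D and 3D regression targets (hypothetical layout).

    box7:       (x, y, z, w, l, h, yaw) in camera coordinates.
    intrinsics: (fx, fy, cx, cy) of the assumed pinhole camera.
    location:   (px, py) pixel position of the feature location being trained.
    """
    x, y, z, w, l, h, yaw = box7
    u, v, depth = project_center((x, y, z), *intrinsics)
    px, py = location
    return {
        # 2D attributes: offset from the location to the projected 3D center.
        "offset": (u - px, v - py),
        # 3D attributes: quantities that cannot be read off the image plane.
        "depth": depth,
        "size": (w, l, h),
        "yaw": yaw,
    }


# Example with made-up numbers: a box 10 m ahead, seen from a nearby location.
targets = decouple_targets(
    box7=(2.0, 1.0, 10.0, 1.9, 4.5, 1.6, 0.3),
    intrinsics=(1000.0, 1000.0, 800.0, 450.0),
    location=(992.0, 544.0),
)
```

Keeping the offset in the image plane and the depth as a separate scalar means the 2D part can reuse standard 2D detector machinery, while the ill-posed depth estimate is isolated in its own regression target.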

Taxonomy

Topics: Advanced Neural Network Applications · Robotics and Sensor-Based Localization · Advanced Image and Video Retrieval Techniques