MonoNext: A 3D Monocular Object Detection with ConvNext

Marcelo Eduardo Pederiva; José Mario De Martino; Alessandro Zimmer

arXiv:2308.00596·cs.CV·August 2, 2023·2 cites

Open Access

TL;DR

MonoNext is a ConvNext-based monocular 3D object detection model that uses a spatial grid and multi-task learning, achieving high accuracy at lower computational cost on the KITTI dataset.

Contribution

The paper presents MonoNext, a novel multi-task learning approach utilizing ConvNext for monocular 3D detection with a simple spatial grid, requiring only 3D bounding box annotations.
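
To make the idea of a spatial grid concrete, the following is a minimal sketch of one possible way to turn 3D bounding box centres into grid targets. It is not the authors' implementation: the summary above does not specify the grid layout, so the extents `X_RANGE` and `Z_RANGE`, the `CELL_SIZE`, and the functions `center_to_cell` and `build_occupancy` are all hypothetical names and values chosen for illustration.

```python
from typing import List, Optional, Tuple

# Hypothetical grid extents in metres (camera frame: x lateral, z forward).
# These values are assumptions, not taken from the paper.
X_RANGE = (-40.0, 40.0)
Z_RANGE = (0.0, 80.0)
CELL_SIZE = 2.0  # hypothetical cell edge length in metres


def center_to_cell(x: float, z: float) -> Optional[Tuple[int, int]]:
    """Return the (row, col) cell containing a box centre, or None if outside the grid."""
    if not (X_RANGE[0] <= x < X_RANGE[1] and Z_RANGE[0] <= z < Z_RANGE[1]):
        return None
    col = int((x - X_RANGE[0]) // CELL_SIZE)
    row = int((z - Z_RANGE[0]) // CELL_SIZE)
    return row, col


def build_occupancy(centers: List[Tuple[float, float]]) -> List[List[int]]:
    """Mark every grid cell that contains at least one object centre."""
    rows = int((Z_RANGE[1] - Z_RANGE[0]) / CELL_SIZE)
    cols = int((X_RANGE[1] - X_RANGE[0]) / CELL_SIZE)
    grid = [[0] * cols for _ in range(rows)]
    for x, z in centers:
        cell = center_to_cell(x, z)
        if cell is not None:  # centres outside the grid are ignored
            grid[cell[0]][cell[1]] = 1
    return grid
```

In a multi-task setup, each occupied cell could additionally carry regression targets such as box dimensions and orientation; that extension is likewise an assumption and is omitted here.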

Findings

01. MonoNext achieves high precision on the KITTI dataset.

02. Adding more training data improves MonoNext's accuracy.

03. Its performance is comparable to state-of-the-art methods.

Abstract

Autonomous driving perception tasks rely heavily on cameras as the primary sensor for Object Detection, Semantic Segmentation, Instance Segmentation, and Object Tracking. However, RGB images captured by cameras lack depth information, which poses a significant challenge in 3D detection tasks. To supply this missing data, mapping sensors such as LIDAR and RADAR are used for accurate 3D Object Detection. Despite their high accuracy, multi-sensor models are expensive and computationally demanding. In contrast, Monocular 3D Object Detection models are becoming increasingly popular, offering a faster, cheaper, and easier-to-implement solution for 3D detection. This paper introduces a different Multi-Task Learning approach called MonoNext that utilizes a spatial grid to map objects in the scene. MonoNext employs a straightforward approach based on the ConvNext…

Taxonomy

Topics: Advanced Neural Network Applications · Robotics and Sensor-Based Localization · Video Surveillance and Tracking Methods

Methods: ConvNeXt