MonoCInIS: Camera Independent Monocular 3D Object Detection using Instance Segmentation
Jonas Heylen, Mark De Wolf, Bruno Dawagne, Marc Proesmans, Luc Van Gool, Wim Abbeloos, Hazem Abdelkawy, Daniel Olmeda Reino

TL;DR
This paper introduces MonoCInIS, a monocular 3D object detection method that is invariant to camera parameters, enabling effective training on heterogeneous datasets and outperforming existing camera-dependent approaches.
Contribution
We propose a camera-independent instance segmentation approach for monocular 3D detection that leverages geometric reasoning, improving performance across diverse datasets.
Findings
Outperforms camera-dependent methods on KITTI3D benchmark
Demonstrates effective use of heterogeneous datasets
Shows that camera invariance enhances generalization
Abstract
Monocular 3D object detection has recently shown promising results; however, challenging problems remain. One of these is the lack of invariance to different camera intrinsic parameters, which vary across 3D object datasets. Little effort has been made to exploit the combination of heterogeneous 3D object datasets. Contrary to general intuition, we show that more data does not automatically guarantee better performance; rather, methods need a degree of 'camera independence' in order to benefit from large and heterogeneous training data. In this paper we propose a category-level pose estimation method based on instance segmentation, using camera-independent geometric reasoning to cope with the varying camera viewpoints and intrinsics of different datasets. Every pixel of an instance predicts the object dimensions, the 3D object reference points…
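To make the dependence on camera intrinsics concrete, the sketch below uses the standard pinhole model to project the same 3D object through two cameras. It is only an illustration of the problem the abstract describes, not the MonoCInIS method itself. The helper names (`intrinsics`, `project`, `normalize`) and the focal lengths and principal points are hypothetical values chosen for the example, not parameters of any particular dataset.

```python
import numpy as np

def intrinsics(f, cx, cy):
    """Build a 3x3 pinhole intrinsic matrix (square pixels, no skew)."""
    return np.array([[f, 0.0, cx],
                     [0.0, f, cy],
                     [0.0, 0.0, 1.0]])

def project(points_cam, K):
    """Project Nx3 camera-frame points (metres) to Nx2 pixel coordinates."""
    uvw = points_cam @ K.T
    return uvw[:, :2] / uvw[:, 2:3]

def normalize(pixels, K):
    """Undo the intrinsics: map pixels to normalized coordinates (x/z, y/z)."""
    ones = np.ones((pixels.shape[0], 1))
    rays = np.hstack([pixels, ones]) @ np.linalg.inv(K).T
    return rays[:, :2]

# Top and bottom of a 1.5 m tall object, 20 m in front of the camera.
points = np.array([[0.0, -0.75, 20.0],
                   [0.0,  0.75, 20.0]])

# Two hypothetical cameras with different focal lengths and principal points.
K_a = intrinsics(700.0, 620.0, 190.0)
K_b = intrinsics(1400.0, 960.0, 600.0)

px_a = project(points, K_a)
px_b = project(points, K_b)

# Apparent height in pixels differs although the object and depth are identical.
height_a = px_a[1, 1] - px_a[0, 1]   # f * H / Z = 700 * 1.5 / 20
height_b = px_b[1, 1] - px_b[0, 1]   # f * H / Z = 1400 * 1.5 / 20

# After removing the intrinsics, both cameras agree on the same viewing rays.
norm_a = normalize(px_a, K_a)
norm_b = normalize(px_b, K_b)

print(height_a, height_b)
print(np.allclose(norm_a, norm_b))
```

A network that learns to map pixel extent directly to depth would see the second camera's object as twice as close, which is one way a camera-dependent method can fail when datasets with different intrinsics are mixed. Working in normalized coordinates is one common way to remove that dependence; the paper's own geometric reasoning may differ.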
