Does depth estimation help object detection?
Bedrettin Cetinkaya, Sinan Kalkan, Emre Akbas

TL;DR
This paper investigates how the use of estimated depth information influences object detection performance, analyzing various factors and proposing an efficient integration strategy that improves accuracy with fewer parameters.
Contribution
It provides a comprehensive analysis of the factors that affect depth-based object detection and introduces an early concatenation strategy that achieves higher mAP than previous methods with significantly fewer parameters.
Findings
Ground-truth depth improves detection accuracy over color-only models.
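Estimated depth does not always improve detection; the outcome depends on factors such as the depth estimation network, the dataset it was trained on, and how depth is integrated into the detector.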
Early concatenation of depth features outperforms previous methods with fewer parameters.
Abstract
Ground-truth depth, when combined with color data, helps improve object detection accuracy over baseline models that only use color. However, estimated depth does not always yield improvements. Many factors affect the performance of object detection when estimated depth is used. In this paper, we comprehensively investigate these factors with detailed experiments, such as using ground-truth vs. estimated depth, effects of different state-of-the-art depth estimation networks, effects of using different indoor and outdoor RGB-D datasets as training data for depth estimation, and different architectural choices for integrating depth into the base object detector network. We propose an early concatenation strategy for depth, which yields higher mAP than previous work while using significantly fewer parameters.
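To make the idea of early concatenation concrete, here is a minimal sketch that appends a depth map to an RGB image as a fourth input channel before any network layer sees it. This is an illustration under stated assumptions, not the authors' implementation: the function name `early_concat`, the min-max depth normalisation, and the [0, 1] scaling of color values are all hypothetical choices that the abstract does not specify.

```python
import numpy as np

def early_concat(rgb: np.ndarray, depth: np.ndarray) -> np.ndarray:
    """Stack a depth map onto an RGB image as a fourth input channel.

    rgb:   (H, W, 3) array with values in [0, 255]
    depth: (H, W) array of ground-truth or estimated depth
    Returns a float32 array of shape (H, W, 4) with all channels in [0, 1].
    """
    if rgb.shape[:2] != depth.shape:
        raise ValueError("rgb and depth must share height and width")
    rgb = rgb.astype(np.float32) / 255.0
    depth = depth.astype(np.float32)
    # Assumption: min-max normalisation; the paper may scale depth differently.
    span = depth.max() - depth.min()
    depth = (depth - depth.min()) / span if span > 0 else np.zeros_like(depth)
    # Concatenate along the channel axis: (H, W, 3) + (H, W, 1) -> (H, W, 4).
    return np.concatenate([rgb, depth[..., None]], axis=-1)

# Toy inputs: a black 4x6 image and a synthetic depth ramp.
rgb = np.zeros((4, 6, 3), dtype=np.uint8)
depth = np.linspace(1.0, 10.0, 24).reshape(4, 6)
x = early_concat(rgb, depth)
print(x.shape)
```

With an input built this way, only the detector's first layer would need to change (to accept four channels instead of three), which is one plausible reason an early-fusion design can add few parameters compared with a separate depth branch.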
Taxonomy
Topics: Advanced Vision and Imaging · Image Processing Techniques and Applications · Advanced Image and Video Retrieval Techniques
Methods: Balanced Selection
