Deeply Exploit Depth Information for Object Detection
Saihui Hou, Zilei Wang, Feng Wu

TL;DR
This paper introduces a two-stage CNN framework that leverages depth information to derive and fuse visual properties, significantly improving RGB-D object detection performance.
Contribution
It proposes a novel property derivation and fusion approach that enhances object detection by effectively integrating depth-derived features within CNNs.
Findings
Achieved state-of-the-art results on a challenging dataset.
Demonstrated the effectiveness of separate property learning before fusion.
Validated the biological plausibility of the detection mechanism.
Abstract
This paper addresses how to coordinate depth with RGB more effectively in order to boost the performance of RGB-D object detection. In particular, we investigate two primary ideas under the CNN model: property derivation and property fusion. First, we propose that depth can be used not only as extra information alongside RGB but also to derive more visual properties that describe the objects of interest comprehensively. A two-stage learning framework consisting of property derivation and fusion is therefore constructed. Here the properties can be derived either from the provided color or depth alone or from their pairs (e.g. the geometry contour adopted in this paper). Second, we explore how to fuse different properties in feature learning, which under the CNN model boils down to the layer at which the properties should be fused together. The analysis shows…
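The fusion question in the abstract, namely at which layer the properties are combined, can be illustrated with a toy sketch. This is not the authors' network: the property names, layer sizes, random weights, and the helper `fused_scores` are all hypothetical, and small dense layers stand in for CNN branches. It only contrasts fusing raw properties at the input ("early") with learning each property in its own branch and concatenating the branch features just before the classifier ("late").

```python
import random


def linear(x, weights, bias):
    """Dense layer: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def init_layer(rng, n_in, n_out):
    """Random weights and zero biases (illustrative, untrained)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def fused_scores(inputs, fuse_at, rng, hidden=4, n_classes=3):
    """Return (features, class scores) for a dict of property vectors.

    fuse_at="early": concatenate raw properties, then one shared branch.
    fuse_at="late":  one branch per property, then concatenate the
                     branch outputs right before the classifier.
    """
    names = sorted(inputs)
    if fuse_at == "early":
        x = [v for name in names for v in inputs[name]]
        w, b = init_layer(rng, len(x), hidden)
        features = relu(linear(x, w, b))
    elif fuse_at == "late":
        features = []
        for name in names:
            w, b = init_layer(rng, len(inputs[name]), hidden)
            features += relu(linear(inputs[name], w, b))
    else:
        raise ValueError("fuse_at must be 'early' or 'late'")
    w, b = init_layer(rng, len(features), n_classes)
    return features, linear(features, w, b)


# Hypothetical per-region property vectors (assumed values, not real data).
toy = {"rgb": [0.2, 0.7, 0.1], "depth": [0.9, 0.4], "contour": [0.0, 1.0]}
early_feat, early_scores = fused_scores(toy, "early", random.Random(0))
late_feat, late_scores = fused_scores(toy, "late", random.Random(0))
```

With late fusion each property keeps its own `hidden`-sized representation, so the classifier sees three times as many features as in the early variant. In a real detector the branches would be trained CNNs, and the fusion layer would be chosen empirically.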
Taxonomy
Topics: Advanced Neural Network Applications · Visual Attention and Saliency Detection · Industrial Vision Systems and Defect Detection
