Crafting GBD-Net for Object Detection
Xingyu Zeng, Wanli Ouyang, Junjie Yan, Hongsheng Li, Tong Xiao, Kun Wang, Yu Liu, Yucong Zhou, Bin Yang, Zhe Wang, Hui Zhou, Xiaogang Wang

TL;DR
This paper introduces GBD-Net, a gated bi-directional CNN that integrates local and contextual visual cues for object detection through message passing and gating, and reports improved detection accuracy on ImageNet, Pascal VOC2007, and MS COCO.
Contribution
The paper proposes GBD-Net, a novel CNN architecture with message passing and gating to better combine multi-region features for object detection.
Findings
Improved detection accuracy on ImageNet, Pascal VOC2007, and MS COCO datasets.
Reported as part of the winning entry in the 2016 ImageNet object detection challenge.
Showed that gated message passing enhances feature integration in object detection.
Abstract
The visual cues from multiple support regions of different sizes and resolutions are complementary in classifying a candidate box in object detection. Effective integration of local and contextual visual cues from these regions has become a fundamental problem in object detection. In this paper, we propose a gated bi-directional CNN (GBD-Net) to pass messages among features from different support regions during both feature learning and feature extraction. Such message passing can be implemented through convolution between neighboring support regions in two directions and can be conducted in various layers. Therefore, local and contextual visual patterns can validate the existence of each other by learning their nonlinear relationships and their close interactions are modeled in a more complex way. It is also shown that message passing is not always helpful but dependent on individual…
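The message passing described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: it assumes 1x1 convolutions (a per-pixel linear map) in place of general convolutions, shares one set of weights across all regions in each direction, and merges the two directions with a final 1x1 convolution. The function names (`conv1x1`, `init_params`, `one_direction`, `gated_bidirectional`) and parameter keys are hypothetical, chosen only for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def conv1x1(x, w, b):
    """1x1 convolution: x is (C_in, H, W), w is (C_out, C_in), b is (C_out,)."""
    return np.einsum("oc,chw->ohw", w, x) + b[:, None, None]

def init_params(channels, seed=0):
    """Random illustrative weights (hypothetical layout, shared across regions)."""
    rng = np.random.default_rng(seed)
    params = {}
    for direction in ("fwd", "bwd"):
        params[direction] = {
            "w_self": rng.normal(0, 0.1, (channels, channels)),
            "b_self": np.zeros(channels),
            "w_msg": rng.normal(0, 0.1, (channels, channels)),
            "w_gate": rng.normal(0, 0.1, (channels, channels)),
            "b_gate": np.zeros(channels),
        }
    params["w_out"] = rng.normal(0, 0.1, (channels, 2 * channels))
    params["b_out"] = np.zeros(channels)
    return params

def one_direction(feats, p):
    """Pass messages along the list order; region i hears only from region i-1."""
    no_bias = np.zeros(p["w_msg"].shape[0])
    out = []
    for i, h0 in enumerate(feats):
        h = conv1x1(h0, p["w_self"], p["b_self"])
        if i > 0:
            # The gate is computed from the neighbor's input features and
            # scales the neighbor's message element-wise (values in 0..1).
            gate = sigmoid(conv1x1(feats[i - 1], p["w_gate"], p["b_gate"]))
            h = h + gate * conv1x1(out[i - 1], p["w_msg"], no_bias)
        out.append(np.maximum(h, 0.0))  # ReLU
    return out

def gated_bidirectional(feats, params):
    """feats: list of (C, H, W) arrays ordered from smallest to largest region."""
    fwd = one_direction(feats, params["fwd"])
    # Reverse the list so messages flow from large regions to small ones,
    # then restore the original order.
    bwd = one_direction(feats[::-1], params["bwd"])[::-1]
    return [
        np.maximum(
            conv1x1(np.concatenate([f, b], axis=0), params["w_out"], params["b_out"]),
            0.0,
        )
        for f, b in zip(fwd, bwd)
    ]
```

In this sketch, a gate value near zero suppresses the neighbor's message, so a region's output then depends only on its own features. That mirrors the abstract's point that message passing is not always helpful.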
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Visual Attention and Saliency Detection
Methods: Convolution
