DeepBox: Learning Objectness with Convolutional Networks
Weicheng Kuo, Bharath Hariharan, Jitendra Malik

TL;DR
DeepBox uses a CNN to rerank object proposals produced by a bottom-up method. It matches the recall of 2000 bottom-up proposals with only 500, raises detection mAP by 4.5 points, and runs at 260 ms per image.
Contribution
The paper introduces DeepBox, a framework that uses a CNN to rerank proposals from a bottom-up method. Its novel four-layer architecture matches much larger networks at evaluating objectness while being much faster.
Findings
Achieves the same recall with 500 proposals as bottom-up methods with 2000.
The ranking improvement generalizes to categories the CNN has never seen before.
Improves detection mAP by 4.5 points.
Runs at 260 ms per image.
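The recall comparison above (500 reranked proposals versus 2000 bottom-up ones) can be illustrated with a minimal sketch of recall at a fixed proposal budget. This is not the paper's evaluation code: the function names, the (x1, y1, x2, y2) box format, and the 0.5 IoU threshold are assumptions chosen for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def recall_at_n(ranked_proposals, gt_boxes, n, iou_thresh=0.5):
    """Fraction of ground-truth boxes matched by any of the top-n proposals.

    ranked_proposals must already be sorted best-first.
    iou_thresh=0.5 is an assumed default, not a value taken from the paper.
    """
    if not gt_boxes:
        return 0.0
    top = ranked_proposals[:n]
    hits = sum(1 for g in gt_boxes if any(iou(p, g) >= iou_thresh for p in top))
    return hits / len(gt_boxes)
```

A better ranking moves good boxes toward the front of the list, so the same recall is reached at a smaller n.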
Abstract
Existing object proposal approaches use primarily bottom-up cues to rank proposals, while we believe that objectness is in fact a high level construct. We argue for a data-driven, semantic approach for ranking object proposals. Our framework, which we call DeepBox, uses convolutional neural networks (CNNs) to rerank proposals from a bottom-up method. We use a novel four-layer CNN architecture that is as good as much larger networks on the task of evaluating objectness while being much faster. We show that DeepBox significantly improves over the bottom-up ranking, achieving the same recall with 500 proposals as achieved by bottom-up methods with 2000. This improvement generalizes to categories the CNN has never seen before and leads to a 4.5-point gain in detection mAP. Our implementation achieves this performance while running at 260 ms per image.
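The reranking step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: `rerank`, `score_fn`, and `toy_objectness` are hypothetical names, and the toy scorer (box area) only stands in for the CNN's objectness score.

```python
def rerank(proposals, score_fn, keep=None):
    """Sort proposals by descending score and optionally keep the top `keep`.

    proposals: list of boxes as (x1, y1, x2, y2), e.g. from a bottom-up method.
    score_fn:  callable mapping a box to a score; in DeepBox this role is
               played by a CNN, here it is any function (assumption).
    """
    ranked = sorted(proposals, key=score_fn, reverse=True)
    return ranked if keep is None else ranked[:keep]


def toy_objectness(box):
    """Hypothetical stand-in scorer: box area. NOT a learned objectness model."""
    x1, y1, x2, y2 = box
    return (x2 - x1) * (y2 - y1)


bottom_up = [(0, 0, 5, 5), (10, 10, 40, 40), (2, 2, 8, 20)]
top_two = rerank(bottom_up, toy_objectness, keep=2)
```

Because the bottom-up method still generates the candidate boxes, the scorer only has to order them, which is what allows a smaller proposal budget to reach the same recall.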
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Multimodal Machine Learning Applications
