DeepID-Net: multi-stage and deformable deep convolutional neural networks for object detection
Wanli Ouyang, Ping Luo, Xingyu Zeng, Shi Qiu, Yonglong Tian, Hongsheng Li, Shuo Yang, Zhe Wang, Yuanjun Xiong, Chen Qian, Zhenyao Zhu, Ruohui Wang, Chen-Change Loy, Xiaogang Wang, Xiaoou Tang

TL;DR
This paper introduces a multi-stage, deformable deep convolutional neural network for object detection. It combines a deformation constrained pooling (def-pooling) layer, a multi-stage training strategy, and a detection-oriented pre-training strategy to improve detection accuracy.
Contribution
It presents a deep learning framework with deformation constrained pooling (def-pooling), multi-stage training, and a new pre-training strategy, improving object detection performance over previous methods.
Findings
Achieved 45% mean average precision on ILSVRC 2014
Ranked #2 in ILSVRC 2014 object detection challenge
Significantly outperformed RCNN in accuracy
Abstract
In this paper, we propose multi-stage and deformable deep convolutional neural networks for object detection. This new deep learning object detection diagram has innovations in multiple aspects. In the proposed new deep architecture, a new deformation constrained pooling (def-pooling) layer models the deformation of object parts with geometric constraint and penalty. With the proposed multi-stage training strategy, multiple classifiers are jointly optimized to process samples at different difficulty levels. A new pre-training strategy is proposed to learn feature representations more suitable for the object detection task and with good generalization capability. By changing the net structures, training strategies, adding and removing some key components in the detection pipeline, a set of models with large diversity are obtained, which significantly improves the effectiveness of…
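The def-pooling idea mentioned in the abstract (pooling a part's response while penalizing displacement from its expected position) can be illustrated with a minimal sketch. This is not the paper's layer: the function name `def_pool`, the quadratic penalty form, and the weights `c_dx2` and `c_dy2` are assumptions chosen for illustration only.

```python
import numpy as np

def def_pool(score_map, anchor, c_dx2=0.1, c_dy2=0.1):
    """Penalized max over a part-score map (illustrative sketch only).

    score_map : 2-D array of part detection scores.
    anchor    : (row, col) expected position of the part (assumed input).
    c_dx2, c_dy2 : hypothetical quadratic deformation weights.
    Returns the best penalized score and the location that achieved it.
    """
    h, w = score_map.shape
    ys, xs = np.mgrid[0:h, 0:w]
    ay, ax = anchor
    # Geometric penalty grows with squared distance from the anchor.
    penalty = c_dy2 * (ys - ay) ** 2 + c_dx2 * (xs - ax) ** 2
    penalized = score_map - penalty
    idx = np.unravel_index(np.argmax(penalized), penalized.shape)
    return float(penalized[idx]), (int(idx[0]), int(idx[1]))

# Toy example: a strong response far from the anchor loses to a
# weaker response next to it once the penalty is applied.
scores = np.zeros((5, 5))
scores[2, 3] = 1.0   # near the anchor
scores[0, 0] = 1.5   # far from the anchor
best_score, best_loc = def_pool(scores, anchor=(2, 2))
```

With the penalty weights set to zero, the function reduces to ordinary max pooling over the map, which is one way to see what the geometric constraint adds.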
Taxonomy
Topics: Advanced Neural Network Applications · Industrial Vision Systems and Defect Detection · Advanced Image and Video Retrieval Techniques
