Deep CNN Ensemble with Data Augmentation for Object Detection
Jian Guo, Stephen Gould

TL;DR
This paper presents a deep CNN ensemble approach with data augmentation that achieves state-of-the-art object detection performance on PASCAL VOC 2012 by combining diverse models and enlarging training data with COCO images.
Contribution
Introduces a novel ensemble of different CNN architectures and uses selective COCO data augmentation to improve object detection accuracy.
Findings
Outperforms all previous methods on the PASCAL VOC 2012 detection task at the time of submission
Ensemble of diverse CNN architectures enhances detection performance
Augmenting the PASCAL VOC training set with a selected subset of COCO images significantly enlarges the training data
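As a rough illustration of the ensemble idea, the sketch below averages per-class detection scores for a single region proposal across several models. This summary does not state how the paper combines its models, so score averaging is an assumption made here for illustration, and the function name `average_scores` and the example scores are hypothetical.

```python
from typing import Dict, List


def average_scores(per_model_scores: List[Dict[str, float]]) -> Dict[str, float]:
    """Average per-class scores for one region proposal across models.

    Assumption: every model scores the same set of classes. Averaging is
    one simple way to combine models, not necessarily the paper's rule.
    """
    if not per_model_scores:
        raise ValueError("need scores from at least one model")
    classes = per_model_scores[0].keys()
    n = len(per_model_scores)
    return {c: sum(scores[c] for scores in per_model_scores) / n for c in classes}


# Hypothetical scores from two CNNs with different architectures.
model_a = {"cat": 1.0, "dog": -0.5}
model_b = {"cat": 0.5, "dog": 0.5}
ensemble = average_scores([model_a, model_b])
```

Models that make different mistakes tend to benefit most from this kind of combination, which is consistent with the emphasis on architectures that are complementary to each other.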
Abstract
We report on the methods used in our recent DeepEnsembleCoco submission to the PASCAL VOC 2012 challenge, which achieves state-of-the-art performance on the object detection task. Our method is a variant of the R-CNN model proposed by Girshick et al. [Girshick:CVPR14], with two key improvements to training and evaluation. First, our method constructs an ensemble of deep CNN models with different architectures that are complementary to each other. Second, we augment the PASCAL VOC training set with images from the Microsoft COCO dataset to significantly enlarge the amount of training data. Importantly, we select a subset of the Microsoft COCO images to be consistent with the PASCAL VOC task. Results on the PASCAL VOC evaluation server show that our proposed method outperforms all previous methods on the PASCAL VOC 2012 detection task at the time of submission.
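One way to picture the COCO subset selection is as a filter over a COCO-style annotation file that keeps only objects whose category corresponds to one of the 20 PASCAL VOC classes. The abstract does not give the actual selection criterion, so the rule below (keep an image if it contains at least one VOC-equivalent object) is an assumption, and `select_voc_consistent` and the toy data are hypothetical.

```python
# PASCAL VOC class name -> COCO category name (several names differ).
VOC_TO_COCO = {
    "aeroplane": "airplane", "bicycle": "bicycle", "bird": "bird",
    "boat": "boat", "bottle": "bottle", "bus": "bus", "car": "car",
    "cat": "cat", "chair": "chair", "cow": "cow",
    "diningtable": "dining table", "dog": "dog", "horse": "horse",
    "motorbike": "motorcycle", "person": "person",
    "pottedplant": "potted plant", "sheep": "sheep", "sofa": "couch",
    "train": "train", "tvmonitor": "tv",
}


def select_voc_consistent(coco: dict) -> dict:
    """Keep annotations of VOC-equivalent categories and the images containing them.

    `coco` follows the COCO instances JSON layout: "images", "annotations",
    and "categories" lists. The selection rule is an illustrative assumption.
    """
    coco_to_voc = {coco_name: voc for voc, coco_name in VOC_TO_COCO.items()}
    # Map COCO category id -> VOC class name for the categories we keep.
    kept = {c["id"]: coco_to_voc[c["name"]]
            for c in coco["categories"] if c["name"] in coco_to_voc}
    annotations = [dict(a, voc_class=kept[a["category_id"]])
                   for a in coco["annotations"] if a["category_id"] in kept]
    image_ids = {a["image_id"] for a in annotations}
    images = [im for im in coco["images"] if im["id"] in image_ids]
    return {"images": images, "annotations": annotations}


# Toy input: image 1 shows a couch (VOC "sofa"), image 2 only a zebra.
toy = {
    "images": [{"id": 1}, {"id": 2}],
    "categories": [{"id": 63, "name": "couch"}, {"id": 24, "name": "zebra"}],
    "annotations": [
        {"image_id": 1, "category_id": 63, "bbox": [10, 20, 100, 50]},
        {"image_id": 2, "category_id": 24, "bbox": [5, 5, 40, 60]},
    ],
}
subset = select_voc_consistent(toy)
```

In this toy case only image 1 survives, and its annotation is relabeled with the VOC class name so it can be merged with the VOC training annotations.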
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning
Methods: Support Vector Machine · Max Pooling · Convolution · R-CNN
