Deep CNN Ensemble with Data Augmentation for Object Detection

Jian Guo; Stephen Gould

arXiv:1506.07224 · cs.CV · June 25, 2015 · 46 cites

TL;DR

This paper presents a deep CNN ensemble approach with data augmentation that achieves state-of-the-art object detection performance on PASCAL VOC 2012 by combining diverse models and enlarging training data with COCO images.

Contribution

Introduces a novel ensemble of different CNN architectures and uses selective COCO data augmentation to improve object detection accuracy.

Findings

01

Outperforms all previous methods on the PASCAL VOC 2012 detection task at the time of submission

02

Ensemble of diverse CNN architectures enhances detection performance

03

Augmenting the PASCAL VOC training set with a selected subset of Microsoft COCO images significantly enlarges the amount of training data

Abstract

We report on the methods used in our recent DeepEnsembleCoco submission to the PASCAL VOC 2012 challenge, which achieves state-of-the-art performance on the object detection task. Our method is a variant of the R-CNN model proposed by Girshick et al. (CVPR 2014), with two key improvements to training and evaluation. First, our method constructs an ensemble of deep CNN models with different architectures that are complementary to each other. Second, we augment the PASCAL VOC training set with images from the Microsoft COCO dataset to significantly enlarge the amount of training data. Importantly, we select a subset of the Microsoft COCO images to be consistent with the PASCAL VOC task. Results on the PASCAL VOC evaluation server show that our proposed method outperforms all previous methods on the PASCAL VOC 2012 detection task at the time of submission.
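The two improvements named in the abstract can be illustrated with a minimal sketch. The abstract does not say how the COCO subset is chosen or how the ensemble members are combined, so both rules below are assumptions made for illustration: the subset rule keeps only COCO annotations whose category corresponds to one of the 20 PASCAL VOC classes and drops images left with none, and the ensemble rule averages per-class scores across models for the same region proposals. The function names `select_voc_consistent` and `ensemble_scores` and the input layouts are hypothetical, not taken from the paper.

```python
from statistics import mean

# COCO category name -> PASCAL VOC class name, for the 20 VOC classes.
COCO_TO_VOC = {
    "airplane": "aeroplane", "bicycle": "bicycle", "bird": "bird",
    "boat": "boat", "bottle": "bottle", "bus": "bus", "car": "car",
    "cat": "cat", "chair": "chair", "cow": "cow",
    "dining table": "diningtable", "dog": "dog", "horse": "horse",
    "motorcycle": "motorbike", "person": "person",
    "potted plant": "pottedplant", "sheep": "sheep", "couch": "sofa",
    "train": "train", "tv": "tvmonitor",
}


def select_voc_consistent(images):
    """Keep VOC-compatible annotations and drop images that have none.

    `images` maps an image id to a list of (coco_category_name, bbox) pairs.
    This selection rule is an assumption; the abstract gives no criterion.
    """
    selected = {}
    for image_id, annotations in images.items():
        kept = [
            (COCO_TO_VOC[name], bbox)
            for name, bbox in annotations
            if name in COCO_TO_VOC
        ]
        if kept:
            selected[image_id] = kept
    return selected


def ensemble_scores(per_model_scores):
    """Average per-class scores over models for each region proposal.

    `per_model_scores` is a list with one dict per model, each mapping a
    proposal id to {class_name: score}. All models are assumed to score the
    same proposals and classes. Averaging is an assumed fusion rule.
    """
    first = per_model_scores[0]
    fused = {}
    for proposal_id, class_scores in first.items():
        fused[proposal_id] = {
            cls: mean(model[proposal_id][cls] for model in per_model_scores)
            for cls in class_scores
        }
    return fused


if __name__ == "__main__":
    coco = {
        "img_a": [("couch", (0, 0, 10, 10)), ("giraffe", (1, 1, 2, 2))],
        "img_b": [("zebra", (0, 0, 1, 1))],
    }
    print(select_voc_consistent(coco))

    model_a = {"p1": {"cat": 0.5, "dog": 0.25}}
    model_b = {"p1": {"cat": 1.0, "dog": 0.75}}
    print(ensemble_scores([model_a, model_b]))
```

In this toy run, `img_b` is discarded because it contains no VOC-compatible object, and the giraffe annotation in `img_a` is dropped while the couch is relabelled as `sofa`. Other fusion rules, such as a weighted average or taking the maximum score, would fit the same interface.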

Taxonomy

Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning

Methods: Support Vector Machine · Max Pooling · Convolution · R-CNN