Webly Supervised Learning of Convolutional Networks

Xinlei Chen; Abhinav Gupta

arXiv:1505.01554 · cs.CV · October 9, 2015 · 67 cites

TL;DR

This paper introduces a webly supervised approach to training CNNs on large-scale web data through a two-stage, curriculum-inspired process. The resulting network outperforms a fine-tuned CNN trained on ImageNet on Pascal VOC 2012 and remains robust to noisy web data.

Contribution

The paper proposes a two-step, curriculum-inspired method for training CNNs on web data: an initial representation is learned from easy images and then adapted to harder, more realistic images, reducing reliance on manually labeled datasets.

Findings

01 Outperforms a fine-tuned CNN trained on ImageNet on Pascal VOC 2012

02 Achieves the best performance on VOC 2007 in the setting where no VOC training data is used

03 Robust to noisy web data, performing comparably with image search results from March 2013

Abstract

We present an approach to utilize large amounts of web data for learning CNNs. Specifically inspired by curriculum learning, we present a two-step approach for CNN training. First, we use easy images to train an initial visual representation. We then use this initial CNN and adapt it to harder, more realistic images by leveraging the structure of data and categories. We demonstrate that our two-stage CNN outperforms a fine-tuned CNN trained on ImageNet on Pascal VOC 2012. We also demonstrate the strength of webly supervised learning by localizing objects in web images and training an R-CNN style detector. It achieves the best performance on VOC 2007 where no VOC training data is used. Finally, we show our approach is quite robust to noise and performs comparably even when we use image search results from March 2013 (pre-CNN image search era).
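
The two-step idea in the abstract (train on easy data first, then adapt to harder data) can be illustrated with a toy sketch. This is not the authors' method: it swaps the CNN for a tiny logistic-regression model on synthetic 2-D points, stands in for "easy" and "hard" images with clean and label-noised samples, and does not model the use of data and category structure at all. The function names, the noise rate, and the choice of a lower learning rate in the second stage are all assumptions made for illustration.

```python
import math
import random

def predict(w, x):
    """Probability of class 1 under a logistic model (z clamped for safety)."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))

def train(weights, data, lr, epochs):
    """Plain SGD on the logistic loss; returns an updated copy of the weights."""
    w = list(weights)
    for _ in range(epochs):
        for x, y in data:
            p = predict(w, x)
            for i, xi in enumerate(x):
                w[i] -= lr * (p - y) * xi
    return w

def make_data(n, noise, rng):
    """Toy points with a bias feature; label is 1 if x0 + x1 > 0,
    flipped with probability `noise` (hypothetical stand-in for web-label noise)."""
    data = []
    for _ in range(n):
        x0, x1 = rng.uniform(-1, 1), rng.uniform(-1, 1)
        y = 1 if x0 + x1 > 0 else 0
        if rng.random() < noise:
            y = 1 - y
        data.append(([x0, x1, 1.0], y))
    return data

def accuracy(w, data):
    """Fraction of examples whose thresholded prediction matches the label."""
    hits = sum((predict(w, x) > 0.5) == (y == 1) for x, y in data)
    return hits / len(data)

rng = random.Random(0)
easy = make_data(200, noise=0.0, rng=rng)   # stand-in for easy images
hard = make_data(200, noise=0.2, rng=rng)   # stand-in for harder, noisier images
test = make_data(500, noise=0.0, rng=rng)   # clean held-out points

# Stage 1: learn an initial model from the easy data only.
w_stage1 = train([0.0, 0.0, 0.0], easy, lr=0.5, epochs=20)

# Stage 2: start from the stage-1 weights and adapt on the harder data,
# using a smaller learning rate (an assumption, as in typical fine-tuning).
w_stage2 = train(w_stage1, hard, lr=0.05, epochs=20)

print("stage 1 accuracy:", accuracy(w_stage1, test))
print("stage 2 accuracy:", accuracy(w_stage2, test))
```

The only point of the sketch is the ordering: the second stage is initialized from the first rather than from scratch, so the noisier data refines an existing representation instead of defining it.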

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Human Pose and Action Recognition · Advanced Neural Network Applications