TL;DR
This paper investigates large-scale weakly supervised pretraining using billions of social media images, demonstrating significant improvements in transfer learning and achieving state-of-the-art accuracy on ImageNet-1k.
Contribution
It provides the first extensive empirical analysis of large-scale hashtag-based pretraining and its impact on transfer learning performance across multiple vision tasks.
Findings
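Achieves 85.4% single-crop top-1 accuracy (97.6% top-5) on ImageNet-1k, the highest reported at the time of publication.
Pretraining on hashtag prediction at scale improves results on several image classification and object detection tasks relative to conventional supervised (ImageNet) pretraining.
Offers new insight into how pretraining dataset size relates to transfer learning effectiveness.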
Abstract
State-of-the-art visual perception models for a wide range of tasks rely on supervised pretraining. ImageNet classification is the de facto pretraining task for these models. Yet, ImageNet is now nearly ten years old and is by modern standards "small". Even so, relatively little is known about the behavior of pretraining with datasets that are multiple orders of magnitude larger. The reasons are obvious: such datasets are difficult to collect and annotate. In this paper, we present a unique study of transfer learning with large convolutional networks trained to predict hashtags on billions of social media images. Our experiments demonstrate that training for large-scale hashtag prediction leads to excellent results. We show improvements on several image classification and object detection tasks, and report the highest ImageNet-1k single-crop, top-1 accuracy to date: 85.4% (97.6% top-5).…
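The top-1 and top-5 figures quoted in the abstract follow the usual convention: a prediction counts as correct under top-k if the true label is among the k highest-scoring classes. The sketch below illustrates that metric on toy data. The function name `top_k_accuracy` and the scores are illustrative assumptions, not the authors' evaluation code.

```python
def top_k_accuracy(scores, labels, k=1):
    """Fraction of examples whose true label is among the k highest-scoring classes.

    scores: list of per-example lists of class scores (hypothetical model outputs)
    labels: list of integer class indices, one per example
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("need at least one example")

    hits = 0
    for row, label in zip(scores, labels):
        # Class indices ordered from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Toy example with 4 images and 3 classes (made-up numbers).
scores = [
    [0.1, 0.7, 0.2],  # best guess: class 1
    [0.5, 0.3, 0.2],  # best guess: class 0, second guess: class 1
    [0.2, 0.3, 0.5],  # best guess: class 2, second guess: class 1
    [0.6, 0.1, 0.3],  # best guess: class 0
]
labels = [1, 1, 0, 0]

top1 = top_k_accuracy(scores, labels, k=1)
top2 = top_k_accuracy(scores, labels, k=2)
print(f"top-1: {top1:.2f}, top-2: {top2:.2f}")
```

On ImageNet-1k the same computation would run over 1,000 classes with k=1 and k=5; "single-crop" means each test image is scored from a single crop rather than averaging predictions over several crops.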
Taxonomy
Methods: Average Pooling · ResNeXt Block · Grouped Convolution · Global Average Pooling · Kaiming Initialization · Residual Connection · 1x1 Convolution · Convolution · Random Horizontal Flip
