Revisiting Weakly Supervised Pre-Training of Visual Perception Models

Mannat Singh; Laura Gustafson; Aaron Adcock; Vinicius de Freitas Reis; Bugra Gedik; Raj Prateek Kosaraju; Dhruv Mahajan; Ross Girshick; Piotr Dollár; Laurens van der Maaten

arXiv:2201.08371·cs.CV·April 5, 2022·5 cites

TL;DR

This paper demonstrates that weakly supervised pre-training using hashtags can outperform self-supervised methods in visual recognition tasks, offering a promising alternative to traditional fully supervised approaches.

Contribution

It introduces SWAG, a weakly supervised pre-training method using hashtags, and shows that the resulting models are competitive across various transfer-learning settings, including zero-shot transfer, and substantially outperform their self-supervised counterparts.

Findings

01

Weakly supervised models outperform self-supervised counterparts.

02

SWAG achieves strong transfer-learning performance.

03

The paper investigates whether the models learned potentially troubling associations or stereotypes.

Abstract

Model pre-training is a cornerstone of modern visual recognition systems. Although fully supervised pre-training on datasets like ImageNet is still the de-facto standard, recent studies suggest that large-scale weakly supervised pre-training can outperform fully supervised approaches. This paper revisits weakly-supervised pre-training of models using hashtag supervision with modern versions of residual networks and the largest-ever dataset of images and corresponding hashtags. We study the performance of the resulting models in various transfer-learning settings including zero-shot transfer. We also compare our models with those obtained via large-scale self-supervised learning. We find our weakly-supervised models to be very competitive across all settings, and find they substantially outperform their self-supervised counterparts. We also include an investigation into whether our…

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques