Vision Models Are More Robust And Fair When Pretrained On Uncurated Images Without Supervision

Priya Goyal; Quentin Duval; Isaac Seessel; Mathilde Caron; Ishan Misra; Levent Sagun; Armand Joulin; Piotr Bojanowski

arXiv:2202.08360·cs.CV·February 23, 2022·48 cites

TL;DR

This paper demonstrates that large-scale self-supervised training on uncurated images yields models that are more robust, fairer, and less biased, and that capture diverse semantic and stylistic information without supervision.

Contribution

It trains dense models, scaled to 10 billion parameters, on billions of uncurated images without supervision, producing more robust and fair models that learn diverse, salient visual information.

Findings

01

Models trained on uncurated images outperform supervised models in fairness and robustness.

02

The approach captures artistic styles, geolocations, and multilingual embeddings from visual content.

03

The resulting models are less biased and more equitable across diverse benchmarks.

Abstract

Discriminative self-supervised learning allows training models on any random group of internet images, and possibly recover salient information that helps differentiate between the images. Applied to ImageNet, this leads to object centric features that perform on par with supervised features on most object-centric downstream tasks. In this work, we question if using this ability, we can learn any salient and more representative information present in diverse unbounded set of images from across the globe. To do so, we train models on billions of random images without any data pre-processing or prior assumptions about what we want the model to learn. We scale our model size to dense 10 billion parameters to avoid underfitting on a large data size. We extensively study and validate our model performance on over 50 benchmarks including fairness, robustness to distribution shift,…

Code & Models

Repositories

facebookresearch/vissl
PyTorch · Official

Videos

[ML News] DeepMind controls fusion | Yann LeCun's JEPA architecture | US: AI can't copyright its art · YouTube

SEER explained: Vision Models more Robust & Fair when pretrained on UNCURATED images!? · YouTube

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Visual Attention and Saliency Detection · Face recognition and analysis