How much human-like visual experience do current self-supervised learning algorithms need in order to achieve human-level object recognition?

A. Emin Orhan

arXiv:2109.11523·cs.CV·May 25, 2022

TL;DR

This study estimates that current self-supervised visual learning algorithms would need several orders of magnitude more natural visual experience than a human lifetime provides to reach human-level object recognition on ImageNet, which points to a large gap in data efficiency.

Contribution

The paper provides the first quantitative estimates of the amount of natural visual experience needed for algorithms to match human performance, revealing it is orders of magnitude greater than a human lifetime.

Findings

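01  Algorithms need on the order of a million to a billion years of natural visual experience, depending on the algorithm, to reach human-level performance on ImageNet.

02  The estimated requirements are even larger for ImageNet-derived robustness benchmarks.

03  The exact estimates are sensitive to underlying assumptions, but even in the most optimistic scenarios they remain orders of magnitude larger than a human lifetime.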

Abstract

This paper addresses a fundamental question: how good are our current self-supervised visual representation learning algorithms relative to humans? More concretely, how much "human-like" natural visual experience would these algorithms need in order to reach human-level performance in a complex, realistic visual object recognition task such as ImageNet? Using a scaling experiment, here we estimate that the answer is several orders of magnitude longer than a human lifetime: typically on the order of a million to a billion years of natural visual experience (depending on the algorithm used). We obtain even larger estimates for achieving human-level performance in ImageNet-derived robustness benchmarks. The exact values of these estimates are sensitive to some underlying assumptions; however, even in the most optimistic scenarios they remain orders of magnitude larger than a human lifetime.…
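The abstract mentions a scaling experiment but does not spell out how the extrapolation is done. The sketch below shows one way such an estimate could be set up: fit accuracy against the logarithm of the amount of visual experience, then solve for the amount needed to hit a target accuracy. The log-linear form, the data points, the target accuracy, and the function names `fit_log_linear` and `hours_to_reach` are all illustrative assumptions, not the paper's method or results.

```python
import math


def fit_log_linear(hours, accuracy):
    """Least-squares fit of accuracy ~ a + b * log10(hours).

    The log-linear form is an assumption made for this sketch only.
    """
    xs = [math.log10(h) for h in hours]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(accuracy) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, accuracy))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def hours_to_reach(target, a, b):
    """Invert the fitted line: hours at which predicted accuracy equals target."""
    return 10 ** ((target - a) / b)


# Assumes visual experience accrues 24 hours a day; counting only
# waking hours would make the estimate in years larger.
HOURS_PER_YEAR = 24 * 365

# Hypothetical measurements (NOT from the paper): accuracy after
# training on increasing amounts of video.
hours = [10.0, 100.0, 1000.0, 10000.0]
accuracy = [0.20, 0.30, 0.40, 0.50]

a, b = fit_log_linear(hours, accuracy)

target = 0.90  # placeholder for a human-level accuracy, not a sourced value
required_hours = hours_to_reach(target, a, b)
required_years = required_hours / HOURS_PER_YEAR

print(f"slope per decade of data: {b:.3f}")
print(f"estimated experience needed: {required_years:.3g} years")
```

Because the required amount depends exponentially on the fitted slope, small changes in the assumed functional form or target accuracy shift the answer by orders of magnitude, which is consistent with the abstract's remark that the estimates are sensitive to underlying assumptions.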

Code & Models

Repositories

eminorhan/human-ssl
PyTorch · Official

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Remote-Sensing Image Classification · Image Processing Techniques and Applications