Universal representations: The missing link between faces, text, planktons, and cat breeds
Hakan Bilen, Andrea Vedaldi

TL;DR
This paper investigates whether a single neural network can serve as a universal visual representation across diverse domains. It shows that, with proper normalization, one model can learn multiple visual domains at once, much as the human visual system relies on one representation for many vision problems.
Contribution
The study shows that neural networks can learn multiple diverse visual domains simultaneously, highlighting the importance of normalization techniques for universal representation.
Findings
A single neural network can learn various visual domains effectively.
Normalization techniques like domain-specific scaling or instance normalization are crucial.
Neural networks can match or surpass specialized models in multi-domain learning.
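The two normalization ideas named in the findings can be illustrated with a minimal sketch. It assumes a feature channel stored as a flat list of floats; the function names, the `DOMAIN_AFFINE` table, and its values are hypothetical and chosen for illustration only, not taken from the paper's implementation.

```python
from statistics import fmean, pvariance

def instance_norm(channel, eps=1e-5):
    """Normalize one channel of ONE sample to zero mean and (near) unit variance.

    Statistics come from the sample itself, so inputs from domains with
    very different value ranges end up on a comparable scale.
    """
    mu = fmean(channel)
    var = pvariance(channel, mu)
    return [(x - mu) / (var + eps) ** 0.5 for x in channel]

# Hypothetical per-domain (scale, bias) pairs. In this sketch these are the
# only domain-specific parameters; every other weight would be shared.
DOMAIN_AFFINE = {"faces": (1.0, 0.0), "text": (0.5, 0.1)}

def domain_scaled(channel, domain):
    """Apply a domain-specific scale and bias to the normalized channel."""
    scale, bias = DOMAIN_AFFINE[domain]
    return [scale * x + bias for x in instance_norm(channel)]

if __name__ == "__main__":
    # Same pattern at two very different magnitudes.
    print(instance_norm([10.0, 20.0, 30.0]))
    print(instance_norm([0.1, 0.2, 0.3]))
    print(domain_scaled([10.0, 20.0, 30.0], "text"))
```

Running the script shows that the two inputs, which differ by a factor of 100, normalize to nearly the same values. The small domain-specific scale and bias then let each domain adjust the shared features cheaply.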
Abstract
With the advent of large labelled datasets and high-capacity models, the performance of machine vision systems has been improving rapidly. However, the technology has still major limitations, starting from the fact that different vision problems are still solved by different models, trained from scratch or fine-tuned on the target data. The human visual system, in stark contrast, learns a universal representation for vision in the early life of an individual. This representation works well for an enormous variety of vision problems, with little or no change, with the major advantage of requiring little training data to solve any of them. In this paper we investigate whether neural networks may work as universal representations by studying their capacity in relation to the "size" of a large combination of vision problems. We do so by showing that a single neural network can learn…
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Image Retrieval and Classification Techniques · Domain Adaptation and Few-Shot Learning
