The Augmented Image Prior: Distilling 1000 Classes by Extrapolating from a Single Image
Yuki M. Asano, Aaqib Saeed

TL;DR
This paper demonstrates that neural networks can learn meaningful visual representations from a single image using augmentation and knowledge distillation, achieving surprisingly high accuracy across multiple datasets and modalities.
Contribution
It introduces a framework for training neural networks from a single image with augmentations and knowledge distillation, revealing the power of the augmented image prior.
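To make "a single image with augmentations" concrete, here is a minimal sketch of how one source image can be expanded into many training patches. It assumes only random cropping and horizontal flipping. The function names, crop size, and patch count are hypothetical and chosen for illustration, and a random array stands in for the actual photograph. The augmentation pipeline used in the paper may differ.

```python
import numpy as np

def random_augmented_crop(image, crop_size, rng):
    """Return one random square crop of `image`, mirrored half of the time."""
    height, width, _ = image.shape
    # Top-left corner is chosen so the crop always fits inside the image.
    top = rng.integers(0, height - crop_size + 1)
    left = rng.integers(0, width - crop_size + 1)
    patch = image[top:top + crop_size, left:left + crop_size]
    if rng.random() < 0.5:
        patch = patch[:, ::-1]  # horizontal flip
    return patch

def make_patch_dataset(image, num_patches, crop_size, seed=0):
    """Build a (num_patches, crop_size, crop_size, channels) array from one image."""
    rng = np.random.default_rng(seed)
    patches = [random_augmented_crop(image, crop_size, rng) for _ in range(num_patches)]
    return np.stack(patches)

# Placeholder for the single source image (assumption: any H x W x 3 uint8 array).
source_image = np.random.default_rng(42).integers(0, 256, size=(256, 384, 3), dtype=np.uint8)
patches = make_patch_dataset(source_image, num_patches=8, crop_size=32)
```

Every patch is drawn from the same image, so the resulting dataset has no class labels of its own. The training signal has to come from elsewhere, which is where the teacher network comes in.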
Findings
Achieves 94% accuracy on CIFAR-10 with a single image.
Attains 69% accuracy on ImageNet from one image.
Extends the approach successfully to video and audio data.
Abstract
What can neural networks learn about the visual world when provided with only a single image as input? While any image obviously cannot contain the multitudes of all existing objects, scenes and lighting conditions - within the space of all 256^(3x224x224) possible 224-sized square images, it might still provide a strong prior for natural images. To analyze this 'augmented image prior' hypothesis, we develop a simple framework for training neural networks from scratch using a single image and augmentations using knowledge distillation from a supervised pretrained teacher. With this, we find the answer to the above question to be: 'surprisingly, a lot'. In quantitative terms, we find accuracies of 94%/74% on CIFAR-10/100, 69% on ImageNet, and by extending this method to video and audio, 51% on Kinetics-400 and 84% on SpeechCommands. In extensive analyses spanning 13 datasets, we…
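The knowledge-distillation step mentioned in the abstract can be sketched as follows. This is a generic temperature-softened KL-divergence loss between teacher and student predictions, written with NumPy so that it is self-contained. The temperature value, the `temperature**2` scaling, and the function names are assumptions taken from common distillation practice, not details confirmed by this summary.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over the last axis (numerically stabilised)."""
    scaled = logits / temperature
    scaled = scaled - scaled.max(axis=-1, keepdims=True)
    exp = np.exp(scaled)
    return exp / exp.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """Mean KL(teacher || student) on softened distributions.

    The temperature**2 factor is a common convention (assumed here) that keeps
    gradient magnitudes comparable across temperatures.
    """
    eps = 1e-12  # avoids log(0)
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = np.sum(p_teacher * (np.log(p_teacher + eps) - np.log(p_student + eps)), axis=-1)
    return float(kl.mean() * temperature ** 2)

# Toy batch: 2 augmented patches, 5 classes (hypothetical numbers).
teacher_logits = np.array([[2.0, 0.5, -1.0, 0.0, 0.3],
                           [0.1, 3.0, 0.2, -0.5, 0.0]])
student_logits = np.zeros_like(teacher_logits)

loss_untrained = distillation_loss(student_logits, teacher_logits)
loss_matched = distillation_loss(teacher_logits, teacher_logits)
```

In this setup the student never sees ground-truth labels: it is trained only to match the teacher's output distribution on augmented patches of the single image. The loss is zero when the student reproduces the teacher exactly and positive otherwise.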
Taxonomy
Topics: Multimodal Machine Learning Applications · Advanced Image Processing Techniques · Advanced Neural Network Applications
Methods: Knowledge Distillation
