Pre-training Vision Transformers with Very Limited Synthesized Images
Ryo Nakamura, Hirokatsu Kataoka, Sora Takashima, Edgar Josafat Martinez Noriega, Rio Yokota, Nakamasa Inoue

TL;DR
This paper shows that vision transformers can be pre-trained on a fractal-based synthetic dataset containing a single image per category, and that the resulting models match or exceed those pre-trained on large-scale datasets such as ImageNet-21k on downstream tasks.
Contribution
The authors introduce the one-instance fractal database (OFDB), a minimal synthetic dataset with one image per category, and show that it can effectively pre-train vision transformers, matching or exceeding pre-training on much larger datasets.
Findings
OFDB outperforms original FDSL datasets in downstream tasks.
Pre-training with OFDB matches or exceeds ImageNet-21k performance.
Small synthetic datasets can be highly effective for vision transformer pre-training.
Abstract
Formula-driven supervised learning (FDSL) is a pre-training method that relies on synthetic images generated from mathematical formulae such as fractals. Prior work on FDSL has shown that pre-training vision transformers on such synthetic datasets can yield competitive accuracy on a wide range of downstream tasks. These synthetic images are categorized according to the parameters of the mathematical formula that generates them. In the present work, we hypothesize that the process for generating different instances of the same category in FDSL can be viewed as a form of data augmentation. We validate this hypothesis by replacing the instances with data augmentation, which means we only need a single image per category. Our experiments show that this one-instance fractal database (OFDB) performs better than the original dataset where instances were explicitly generated. We further scale…
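The central idea of the abstract, one stored image per category with variation supplied by augmentation, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the two iterated function systems (a Sierpinski triangle and a Heighway dragon), the chaos-game renderer, the flip-and-shift augmentation, and the names render_ifs, augment, and sample are all hypothetical. The authors' actual generation and augmentation pipeline may differ substantially.

```python
import random

# Hypothetical categories: each is defined only by its IFS parameters.
# Each map is (a, b, c, d, e, f) for x' = a*x + b*y + e, y' = c*x + d*y + f.
SIERPINSKI = [(0.5, 0.0, 0.0, 0.5, 0.0, 0.0),
              (0.5, 0.0, 0.0, 0.5, 0.5, 0.0),
              (0.5, 0.0, 0.0, 0.5, 0.25, 0.5)]
DRAGON = [(0.5, -0.5, 0.5, 0.5, 0.0, 0.0),
          (-0.5, -0.5, 0.5, -0.5, 1.0, 0.0)]
CATEGORIES = {0: SIERPINSKI, 1: DRAGON}


def render_ifs(maps, size=64, n_points=20000, seed=0):
    """Render one binary size x size image with the chaos game."""
    rng = random.Random(seed)
    x, y = 0.0, 0.0
    points = []
    for i in range(n_points):
        a, b, c, d, e, f = rng.choice(maps)
        x, y = a * x + b * y + e, c * x + d * y + f
        if i >= 20:  # discard the first few points (burn-in)
            points.append((x, y))
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    min_x, min_y = min(xs), min(ys)
    span_x = (max(xs) - min_x) or 1.0
    span_y = (max(ys) - min_y) or 1.0
    img = [[0] * size for _ in range(size)]
    for px, py in points:
        col = int((px - min_x) / span_x * (size - 1))
        row = int((py - min_y) / span_y * (size - 1))
        img[row][col] = 1
    return img


def augment(img, rng):
    """Random horizontal flip plus a small random shift (illustrative choice)."""
    size = len(img)
    src = [row[::-1] for row in img] if rng.random() < 0.5 else img
    max_shift = size // 8
    dx = rng.randint(-max_shift, max_shift)
    dy = rng.randint(-max_shift, max_shift)
    out = [[0] * size for _ in range(size)]
    for r in range(size):
        for c in range(size):
            rr, cc = r + dy, c + dx
            if 0 <= rr < size and 0 <= cc < size:
                out[rr][cc] = src[r][c]
    return out


# The whole "dataset": exactly one rendered image per category.
base_images = {label: render_ifs(maps) for label, maps in CATEGORIES.items()}


def sample(label, rng):
    """Return an augmented view of the single stored image and its label."""
    return augment(base_images[label], rng), label
```

In this sketch, a training loop would call sample(label, rng) repeatedly, so the variation that would otherwise come from separately generated instances comes from the augmentation step instead.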
Taxonomy
Topics: Image Retrieval and Classification Techniques · Fractal and DNA sequence analysis · Neural Networks and Applications
