Improving Fractal Pre-training
Connor Anderson, Ryan Farrell

TL;DR
This paper improves fractal-based pre-training for computer vision networks, building on Kataoka et al. (2020). It offers a low-cost alternative to large curated image datasets, free of their curation, privacy, and bias concerns, with only a modest loss in accuracy.
Contribution
It proposes using dynamically-generated fractal images for pre-training, eliminating issues of data curation, privacy, and bias while maintaining high accuracy.
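A minimal sketch of how a fractal image can be generated on the fly, assuming an iterated function system (IFS) rendered with the chaos game. The function names `sample_ifs` and `render_fractal`, the sampling ranges, the contraction factor, and the image size are all illustrative assumptions, not the authors' procedure or parameters.

```python
import numpy as np

def sample_ifs(rng, n_maps=3, scale=0.6):
    """Draw a random IFS: n_maps affine maps x -> A x + b (hypothetical sampler)."""
    A = rng.uniform(-1.0, 1.0, size=(n_maps, 2, 2))
    for k in range(n_maps):
        # Rescale so the largest singular value equals `scale` (< 1),
        # which makes every map a contraction and keeps the orbit bounded.
        largest = np.linalg.svd(A[k], compute_uv=False)[0]
        A[k] *= scale / largest
    b = rng.uniform(-1.0, 1.0, size=(n_maps, 2))
    return A, b

def render_fractal(A, b, rng, n_points=20000, size=64, burn_in=20):
    """Render the IFS attractor as a binary image using the chaos game."""
    choices = rng.integers(0, len(A), size=n_points + burn_in)
    x = np.zeros(2)
    pts = np.empty((n_points, 2))
    for i, k in enumerate(choices):
        x = A[k] @ x + b[k]          # apply one randomly chosen affine map
        if i >= burn_in:             # discard early points far from the attractor
            pts[i - burn_in] = x
    # Map the point cloud onto the pixel grid.
    lo, hi = pts.min(axis=0), pts.max(axis=0)
    span = np.where(hi - lo > 0, hi - lo, 1.0)
    pix = ((pts - lo) / span * (size - 1)).astype(int)
    img = np.zeros((size, size), dtype=np.uint8)
    img[pix[:, 1], pix[:, 0]] = 255
    return img

rng = np.random.default_rng(0)
A, b = sample_ifs(rng)
img = render_fractal(A, b, rng)
```

Because each image is computed from a handful of numbers, nothing needs to be stored or labelled by hand: in this sketch the sampled `(A, b)` pair could itself serve as the class identity, which is an assumption about how labels might be assigned.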
Findings
Fine-tuning a network pre-trained on fractals reaches 92.7-98.1% of the accuracy of an ImageNet pre-trained network.
Fractal datasets are cost-free, unbiased, and limitless in diversity.
The method sidesteps the main challenges of building large-scale image datasets at a small cost in accuracy.
Abstract
The deep neural networks used in modern computer vision systems require enormous image datasets to train them. These carefully-curated datasets typically have a million or more images, across a thousand or more distinct categories. The process of creating and curating such a dataset is a monumental undertaking, demanding extensive effort and labelling expense and necessitating careful navigation of technical and social issues such as label accuracy, copyright ownership, and content bias. What if we had a way to harness the power of large image datasets but with few or none of the major issues and concerns currently faced? This paper extends the recent work of Kataoka et al. (2020), proposing an improved pre-training dataset based on dynamically-generated fractal images. Challenging issues with large-scale image datasets become points of elegance for fractal pre-training: perfect…
