Improving Fractal Pre-training

Connor Anderson; Ryan Farrell

arXiv:2110.03091 · cs.CV · December 20, 2021


TL;DR

This paper improves fractal-based pre-training for computer vision networks, building on Kataoka et al. (2020). Dynamically generated fractal images stand in for large curated image datasets, avoiding their labelling cost and bias concerns with only a small loss in accuracy.

Contribution

The paper proposes an improved pre-training dataset built from dynamically generated fractal images. This avoids the data curation, privacy, and bias issues of large image datasets while retaining most of the accuracy of ImageNet pre-training.
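As a rough illustration of what generating a fractal image on the fly can look like, the sketch below renders a random iterated function system (IFS) with the chaos game. It assumes the fractals come from IFS codes, as in the FractalDB line of work that this paper extends. The function names (`sample_ifs`, `render_ifs`), the sampling ranges, and the contraction cap are hypothetical choices made for this example, not the authors' implementation.

```python
import numpy as np


def sample_ifs(rng, n_maps=3, max_scale=0.8):
    """Draw a random IFS: n_maps affine maps x -> A @ x + b (hypothetical sampler)."""
    A = rng.uniform(-1.0, 1.0, size=(n_maps, 2, 2))
    b = rng.uniform(-1.0, 1.0, size=(n_maps, 2))
    for k in range(n_maps):
        # The spectral norm is the largest singular value; cap it below 1
        # so every map is a contraction and the orbit stays bounded.
        s = np.linalg.norm(A[k], 2)
        if s > max_scale:
            A[k] *= max_scale / s
    return A, b


def render_ifs(A, b, rng, n_points=20000, size=64, burn_in=20):
    """Chaos game: repeatedly apply a randomly chosen map and plot the orbit."""
    x = np.zeros(2)
    pts = np.empty((n_points, 2))
    choices = rng.integers(0, len(A), size=n_points + burn_in)
    for i, k in enumerate(choices):
        x = A[k] @ x + b[k]
        if i >= burn_in:  # skip the first few points, before the orbit settles
            pts[i - burn_in] = x
    # Rescale the point cloud to pixel coordinates in [0, size - 1].
    lo, hi = pts.min(axis=0), pts.max(axis=0)
    span = np.maximum(hi - lo, 1e-12)
    ij = ((pts - lo) / span * (size - 1)).astype(int)
    img = np.zeros((size, size), dtype=np.uint8)
    img[ij[:, 1], ij[:, 0]] = 255
    return img


rng = np.random.default_rng(0)
A, b = sample_ifs(rng)
img = render_ifs(A, b, rng)  # 64x64 binary image of one random fractal
```

Because each image is fully determined by a handful of affine parameters and a random seed, a data loader could regenerate images on demand instead of storing them, which is one way to read "dynamically-generated" here.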

Findings

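1. Networks pre-trained on fractal images reach 92.7-98.1% of the accuracy of an ImageNet pre-trained network.

2. Fractal datasets cost nothing to label, avoid privacy and content-bias concerns, and offer a limitless supply of diverse images.

3. Fractal pre-training sidesteps the curation challenges of large-scale image datasets at only a small cost in performance.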

Abstract

The deep neural networks used in modern computer vision systems require enormous image datasets to train them. These carefully-curated datasets typically have a million or more images, across a thousand or more distinct categories. The process of creating and curating such a dataset is a monumental undertaking, demanding extensive effort and labelling expense and necessitating careful navigation of technical and social issues such as label accuracy, copyright ownership, and content bias. What if we had a way to harness the power of large image datasets but with few or none of the major issues and concerns currently faced? This paper extends the recent work of Kataoka et al. (2020), proposing an improved pre-training dataset based on dynamically-generated fractal images. Challenging issues with large-scale image datasets become points of elegance for fractal pre-training: perfect…


Code & Models

Repositories

catalys1/fractal-pretraining
pytorch

Datasets

Mitsua/color-multi-fractal-db-1k
dataset · 336 dl

Videos

Improving Fractal Pre-training · youtube