FFCV: Accelerating Training by Removing Data Bottlenecks

Guillaume Leclerc; Andrew Ilyas; Logan Engstrom; Sung Min Park; Hadi Salman; Aleksander Madry

arXiv:2306.12517·cs.LG·June 23, 2023·2 cites

TL;DR

FFCV is a library that significantly accelerates machine learning training by optimizing data loading and transfer, enabling faster training times without sacrificing accuracy.

Contribution

The paper introduces FFCV, a library that combines an efficient file storage format, caching, data pre-loading, asynchronous data transfer, and just-in-time compilation to eliminate data bottlenecks and improve GPU utilization during training.

Findings

01

Trains ResNet-50 on ImageNet to 75% accuracy in 20 minutes on a single machine.

02

Achieves efficient GPU utilization and faster training times.

03

Demonstrates ease of use and adaptability across different resource constraints.

Abstract

We present FFCV, a library for easy and fast machine learning model training. FFCV speeds up model training by eliminating (often subtle) data bottlenecks from the training process. In particular, we combine techniques such as an efficient file storage format, caching, data pre-loading, asynchronous data transfer, and just-in-time compilation to (a) make data loading and transfer significantly more efficient, ensuring that GPUs can reach full utilization; and (b) offload as much data processing as possible to the CPU asynchronously, freeing GPU cycles for training. Using FFCV, we train ResNet-18 and ResNet-50 on the ImageNet dataset with a competitive tradeoff between accuracy and training time. For example, we are able to train an ImageNet ResNet-50 model to 75% in only 20 minutes on a single machine. We demonstrate FFCV's performance, ease-of-use, extensibility, and ability to adapt to…
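
The data pre-loading and asynchronous CPU processing named in the abstract can be illustrated with a minimal sketch. This is not FFCV's API or implementation: `prefetch`, `preprocess`, and `depth` are hypothetical names chosen for illustration, and a standard-library thread with a bounded queue stands in for whatever machinery the library actually uses. The sketch only shows the general idea of preparing the next batches in the background while the consumer works on the current one.

```python
import queue
import threading
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")
U = TypeVar("U")


def prefetch(
    batches: Iterable[T],
    preprocess: Callable[[T], U],
    depth: int = 2,
) -> Iterator[U]:
    """Yield preprocess(batch) in order, buffering up to `depth` prepared results.

    Hypothetical helper for illustration only; not part of FFCV.
    """
    # Bounded queue: the worker blocks once `depth` results are waiting,
    # so memory use stays capped.
    buf: queue.Queue = queue.Queue(maxsize=depth)

    def worker() -> None:
        try:
            for batch in batches:
                buf.put(("item", preprocess(batch)))
            buf.put(("done", None))
        except Exception as exc:
            # Forward failures to the consumer instead of dying silently.
            buf.put(("error", exc))

    # Daemon thread: it will not keep the process alive if the consumer
    # stops iterating early.
    threading.Thread(target=worker, daemon=True).start()

    while True:
        kind, payload = buf.get()
        if kind == "done":
            return
        if kind == "error":
            raise payload
        yield payload
```

A training loop would iterate over `prefetch(raw_batches, decode_and_augment)` in place of the raw batches, where both arguments are again placeholders. In CPython, a thread overlaps with the consumer only while `preprocess` releases the global interpreter lock (file I/O, most NumPy operations), so this pattern helps most when loading or decoding dominates.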

Taxonomy

Topics: Advanced Neural Network Applications · COVID-19 diagnosis using AI · Machine Learning in Healthcare

Methods: Lib