Strategies and impact of learning curve estimation for CNN-based image classification

Laura Didyk; Brayden Yarish; Michael A. Beck; Christopher P. Bidinosti; Christopher J. Henry

arXiv:2310.08470 · cs.LG · October 13, 2023 · 1 citation

Open Access

TL;DR

This paper explores strategies for efficiently estimating learning curves of CNN models in image classification, aiming to reduce training time while maintaining accurate performance predictions.

Contribution

It formulates a framework for sampling strategies to estimate learning curves efficiently and evaluates these methods on popular datasets and models.

Findings

01. Power law behavior of learning curves enables performance prediction.

02. Proposed strategies reduce training time for model selection.

03. Evaluation shows strategies maintain accuracy in learning curve estimation.
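
To make the first finding concrete, here is a minimal sketch of fitting a power law to a few small-subset measurements and extrapolating to a larger training volume. The functional form `error ≈ a * n**(-b)`, the log-log least-squares fit, and the names `fit_power_law` and `predict_error` are assumptions made for illustration; this page does not state which parameterization or fitting procedure the paper uses. The numbers are synthetic, not results from the paper.

```python
import numpy as np

def fit_power_law(sizes, errors):
    """Fit errors ≈ a * sizes**(-b) by least squares in log-log space.

    Assumed form for illustration only; returns (a, b).
    """
    log_n = np.log(np.asarray(sizes, dtype=float))
    log_e = np.log(np.asarray(errors, dtype=float))
    # A power law is a straight line in log-log space:
    # log(error) = log(a) - b * log(n)
    slope, intercept = np.polyfit(log_n, log_e, 1)
    return float(np.exp(intercept)), float(-slope)

def predict_error(a, b, n):
    """Extrapolate the fitted curve to training volume n."""
    return a * float(n) ** (-b)

# Synthetic, noise-free points generated from a known power law (a=2.0, b=0.5).
sizes = np.array([100.0, 200.0, 400.0, 800.0])
errors = 2.0 * sizes ** (-0.5)

a, b = fit_power_law(sizes, errors)
full_size_error = predict_error(a, b, 10_000)
```

With real measurements the points are noisy, so the recovered parameters and the extrapolated value are only estimates, and the quality of the estimate depends on which subset sizes were sampled.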

Abstract

Learning curves are a measure of how the performance of machine learning models improves given a certain volume of training data. Over a wide variety of applications and models, it has been observed that learning curves follow, to a large extent, a power law behavior. This makes the performance of different models for a given task somewhat predictable and opens the opportunity to reduce the training time for practitioners who are exploring the space of possible models and hyperparameters for the problem at hand. By estimating the learning curve of a model from training on small subsets of data, only the best models need to be considered for training on the full dataset. How to choose subset sizes and how often to sample models on these to obtain estimates has, however, not been researched. Given that the goal is to reduce overall training time, strategies are needed that sample the performance in…
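
The selection step described in the abstract can be sketched roughly as follows: train each candidate on a few small subsets, possibly several times per subset size, estimate its curve, and keep only the models with the best extrapolated performance. Everything here is a hypothetical illustration: the `select_models` and `synthetic_runs` helpers, the use of error rate as the metric, the averaging of repeated runs, and the log-log fit are assumptions, not the paper's strategies. The measurements are synthetic.

```python
import numpy as np

def select_models(measurements, full_size, keep=1):
    """Rank candidates by extrapolated error at full_size.

    measurements: {model_name: {subset_size: [error, error, ...]}}
    Returns (names of the `keep` best models, {model_name: predicted error}).
    """
    predicted = {}
    for name, runs in measurements.items():
        sizes = sorted(runs)
        # Average the repeated runs at each subset size.
        means = [np.mean(runs[n]) for n in sizes]
        # Assumed power law: a straight line in log-log space.
        slope, intercept = np.polyfit(np.log(sizes), np.log(means), 1)
        predicted[name] = float(np.exp(intercept + slope * np.log(full_size)))
    ranked = sorted(predicted, key=predicted.get)
    return ranked[:keep], predicted

def synthetic_runs(a, b, sizes):
    """Two synthetic runs per size, scattered ±2% around a * n**(-b)."""
    return {n: [a * n ** (-b) * 0.98, a * n ** (-b) * 1.02] for n in sizes}

subset_sizes = [100, 200, 400, 800]
measurements = {
    # Worse on the smallest subset, but improves faster with more data.
    "model_a": synthetic_runs(5.0, 0.4, subset_sizes),
    # Better on the smallest subset, but improves slowly.
    "model_b": synthetic_runs(1.0, 0.1, subset_sizes),
}

kept, predicted = select_models(measurements, full_size=50_000, keep=1)
```

In this synthetic setup `model_b` has the lower error at 100 samples, yet the extrapolated curves favor `model_a` at the full dataset size, which is why ranking by the estimated curve rather than by a single small-subset score can matter.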

Taxonomy

Topics: Machine Learning and Data Classification · Anomaly Detection Techniques and Applications · Neural Networks and Applications