Learning to Rank Learning Curves

Martin Wistuba; Tejaswini Pedapati

arXiv:2006.03361 · cs.LG · June 8, 2020 · 5 cites


TL;DR

This paper introduces a ranking-based transfer learning approach to early stopping: poor configurations are terminated early in training, which cuts computational cost (by a factor of up to 100 in neural architecture search) without significantly degrading the discovered architecture.

Contribution

It proposes a model that is trained with a pairwise ranking loss on learning curves from other datasets and ranks partially observed learning curves, so that poor configurations can be identified and terminated early.
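To make the idea of a pairwise ranking loss concrete, here is a minimal sketch. It assumes a logistic (RankNet-style) formulation in which a model assigns each partial learning curve a scalar score and the probability that curve i beats curve j is the sigmoid of the score difference. The function names `pairwise_prob` and `pairwise_ranking_loss` are hypothetical, and the exact loss and model used in the paper may differ.

```python
import math

def _softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def pairwise_prob(score_i: float, score_j: float) -> float:
    """Assumed model: P(curve i ends up better than curve j) = sigmoid(s_i - s_j)."""
    return math.exp(-_softplus(-(score_i - score_j)))

def pairwise_ranking_loss(scores, final_values):
    """Mean cross-entropy over all pairs (i, j) where i truly finished better than j.

    scores:       model outputs for partially observed curves (hypothetical inputs)
    final_values: final performance of each configuration (higher is better)
    """
    total, count = 0.0, 0
    for i in range(len(scores)):
        for j in range(len(scores)):
            if final_values[i] > final_values[j]:
                # -log sigmoid(s_i - s_j) equals softplus(s_j - s_i)
                total += _softplus(scores[j] - scores[i])
                count += 1
    return total / count if count else 0.0
```

The loss is small when the scores order the curves the same way their final performance does, and grows when the order is reversed. Only the relative order matters, which is why such a loss does not require predicting the exact final accuracy.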

Findings

01. Achieves up to 100x speedup in neural architecture search

02. Effectively ranks learning curves with limited observations

03. Maintains high-quality model selection despite early stopping

Abstract

Many automated machine learning methods, such as those for hyperparameter and neural architecture optimization, are computationally expensive because they involve training many different model configurations. In this work, we present a new method that saves computational budget by terminating poor configurations early on in the training. In contrast to existing methods, we consider this task as a ranking and transfer learning problem. We qualitatively show that by optimizing a pairwise ranking loss and leveraging learning curves from other datasets, our model is able to effectively rank learning curves without having to observe many or very long learning curves. We further demonstrate that our method can be used to accelerate a neural architecture search by a factor of up to 100 without a significant performance degradation of the discovered architecture. In further experiments we…
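The early-termination idea described in the abstract can be sketched as a simple decision rule. This is an illustration under stated assumptions, not the authors' procedure: it assumes a ranking model has already produced a scalar score for each candidate, that the probability of a candidate beating the current best is the sigmoid of the score difference, and that a candidate is stopped when this probability falls below a threshold. The names `prob_better`, `select_for_full_training`, and the threshold value `delta=0.45` are hypothetical choices for the example.

```python
import math

def prob_better(score_new: float, score_best: float) -> float:
    """Assumed model: P(new beats best) = sigmoid(score_new - score_best)."""
    return 1.0 / (1.0 + math.exp(-(score_new - score_best)))

def select_for_full_training(scores, delta=0.45):
    """Return indices of candidates that would NOT be terminated early.

    Simplification (assumption): the predicted score stands in for the
    final performance when deciding which candidate is the current best.
    """
    kept = []
    best = None
    for idx, score in enumerate(scores):
        if best is not None and prob_better(score, best) < delta:
            continue  # terminate early: unlikely to beat the current best
        kept.append(idx)  # train this candidate to completion
        if best is None or score > best:
            best = score
    return kept

# Hypothetical scores for four candidate configurations.
print(select_for_full_training([0.0, 2.0, -1.0, 1.9]))  # [0, 1, 3]
```

In this toy run the third candidate is stopped because its chance of beating the current best is far below the threshold, while the fourth is kept because it is nearly tied with the best. A lower `delta` makes the rule more conservative and terminates fewer runs.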


Videos

Learning to Rank Learning Curves · SlidesLive

Taxonomy

Topics: Machine Learning and Data Classification · Machine Learning and Algorithms · Advanced Neural Network Applications

Methods: Sigmoid Activation · Softmax · Tanh Activation · Long Short-Term Memory