Early Stopping without a Validation Set

Maren Mahsereci; Lukas Balles; Christoph Lassner; Philipp Hennig

arXiv:1703.09580 · cs.LG · June 7, 2017 · 27 cites

Open Access

TL;DR

This paper introduces a new early stopping method that eliminates the need for a validation set by using local gradient statistics, applicable to various models including neural networks.

Contribution

It proposes a novel early stopping criterion based on gradient statistics that removes the reliance on a validation set for model training.

Findings

01 Effective in least-squares and logistic regression

02 Works well with neural networks

03 Achieves comparable performance without a validation set

Abstract

Early stopping is a widely used technique to prevent poor generalization performance when training an over-expressive model by means of gradient-based optimization. To find a good point to halt the optimizer, a common practice is to split the dataset into a training and a smaller validation set to obtain an ongoing estimate of the generalization performance. We propose a novel early stopping criterion that is based on fast-to-compute local statistics of the computed gradients and entirely removes the need for a held-out validation set. Our experiments show that this is a viable approach in the setting of least-squares and logistic regression, as well as neural networks.
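
The abstract does not spell out the criterion itself, so the following is only a minimal sketch of what a stopping test built on local gradient statistics might look like. It assumes access to per-example gradients for one mini-batch and compares the squared mean gradient against the per-parameter sample variance; the function name `should_stop`, the `eps` guard, and the exact form of the statistic are illustrative assumptions, not taken from this page.

```python
import numpy as np


def should_stop(per_example_grads, eps=1e-12):
    """Hypothetical gradient-statistics stopping test (illustrative only).

    per_example_grads: array of shape (batch_size, num_params), one gradient
    row per training example in the current mini-batch.
    Returns (stop, statistic). stop is True when the mean gradient is small
    relative to its sampling noise, i.e. the remaining gradient signal is
    hard to distinguish from noise.
    """
    grads = np.asarray(per_example_grads, dtype=float)
    batch_size, num_params = grads.shape

    mean_grad = grads.mean(axis=0)        # mini-batch gradient estimate
    var_grad = grads.var(axis=0, ddof=1)  # per-parameter sample variance

    # Sum of per-parameter signal-to-noise ratios; eps is an assumed guard
    # against division by zero when a parameter has no gradient variance.
    snr = np.sum(mean_grad ** 2 / (var_grad + eps))

    # Assumed form of the statistic: positive values suggest stopping.
    statistic = 1.0 - (batch_size / num_params) * snr
    return bool(statistic > 0.0), float(statistic)


# Gradients that cancel across examples: no signal left, so the test fires.
noise_only = np.array([[1.0, -1.0], [-1.0, 1.0]])
# Gradients that agree across examples: clear signal, so training continues.
clear_signal = np.array([[1.0, 1.0], [1.1, 0.9]])

stop_noise, _ = should_stop(noise_only)
stop_signal, _ = should_stop(clear_signal)
```

In a training loop, such a test would be evaluated on the gradients already computed for each update, which is why no held-out data is involved. Per-example gradients require some care to obtain in most frameworks, so that part is left out of the sketch.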

Taxonomy

Topics: Neural Networks and Applications · Gaussian Processes and Bayesian Inference · Domain Adaptation and Few-Shot Learning