Identifying Training Stop Point with Noisy Labeled Data

Sree Ram Kamabattula, Venkat Devarajan, Babak Namazi, Ganesh Sankaranarayanan

arXiv:2012.13435 · cs.LG · July 8, 2021

TL;DR

This paper introduces AutoTSP, a novel method that automatically determines the optimal training stopping point for deep neural networks trained on noisy data, solely based on training behavior without requiring clean validation data or noise estimation.

Contribution

The paper proposes a training stop point detection method that relies only on training accuracy trends, eliminating the need for clean validation sets or noise ratio knowledge.

Findings

1. AutoTSP effectively finds near-optimal stopping points across various datasets and noise conditions.

2. The method is robust to different noise ratios and types, maintaining high test accuracy.

3. AutoTSP outperforms existing early stopping techniques that require additional data or noise estimates.

Abstract

Training deep neural networks (DNNs) with noisy labels is a challenging problem due to over-parameterization. DNNs tend to essentially fit on clean samples at a higher rate in the initial stages, and later fit on the noisy samples at a relatively lower rate. Thus, with a noisy dataset, the test accuracy increases initially and drops in the later stages. To find an early stopping point at the maximum obtainable test accuracy (MOTA), recent studies assume that i) a clean validation set is available, ii) the noise ratio is known, or both. However, a clean validation set is often unavailable, and the noise estimation can be inaccurate. To overcome these issues, we provide a novel training solution, free of these conditions. We analyze the rate of change of the training accuracy for different noise ratios under different conditions to identify a training stop region. We further…
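
The abstract describes using the rate of change of training accuracy to locate a stop region, but the excerpt here is truncated and does not state the actual criterion. The sketch below is therefore only an illustration of the general idea, not the AutoTSP algorithm: it smooths the per-epoch change in training accuracy over a trailing window and flags the first epoch at which that rate drops well below the fastest rate seen so far, on the assumption that this marks the shift from fitting clean samples to fitting noisy ones. The function names `accuracy_slopes` and `find_stop_epoch`, and the `window` and `ratio` parameters, are hypothetical choices made for this example.

```python
from typing import List, Optional


def accuracy_slopes(train_acc: List[float], window: int = 3) -> List[float]:
    """Average per-epoch change in training accuracy over a trailing window.

    The slope at position i corresponds to epoch i + window (0-indexed).
    """
    return [
        (train_acc[t] - train_acc[t - window]) / window
        for t in range(window, len(train_acc))
    ]


def find_stop_epoch(
    train_acc: List[float], window: int = 3, ratio: float = 0.25
) -> Optional[int]:
    """Return the first epoch whose smoothed slope falls below
    ratio * (largest slope seen so far), or None if that never happens.

    Assumption for this sketch: a sharp slowdown in training accuracy
    signals that the network has begun fitting the noisy samples.
    """
    peak = 0.0
    for i, slope in enumerate(accuracy_slopes(train_acc, window)):
        peak = max(peak, slope)
        if peak > 0 and slope < ratio * peak:
            return i + window
    return None


# Toy training-accuracy curve: fast early gains, then a slow crawl.
curve = [0.10, 0.30, 0.50, 0.65, 0.70, 0.72, 0.73, 0.74, 0.75, 0.76]
stop_epoch = find_stop_epoch(curve)
```

Only the training-accuracy history is consumed here, which matches the setting in the abstract where neither a clean validation set nor a noise-ratio estimate is available. The threshold `ratio` is an arbitrary value for the toy curve and would need tuning on real training runs.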

Taxonomy

Methods: Early Stopping