# Task-Driven Data Verification via Gradient Descent

**Authors:** Siavash Golkar, Kyunghyun Cho

arXiv: 1905.05843 · 2019-05-16

## TL;DR

This paper presents CDGD, a gradient descent-based algorithm for detecting corrupted or mislabeled samples in training datasets using a small clean validation set, applicable across supervised learning tasks.

## Contribution

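The paper introduces a gradient descent-based method for data verification: it learns a per-sample inclusion variable for the noisy training set by optimizing network performance on a small clean validation set, and uses those variables to identify corrupted samples.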

## Key findings

- Detection of mislabeled samples is evaluated quantitatively on synthetic and real world datasets.
- The method is reported to outperform baseline approaches in identifying corrupted data.
- The approach applies to any supervised learning task and is not limited to classification.

## Abstract

We introduce a novel algorithm for the detection of possible sample corruption such as mislabeled samples in a training dataset given a small clean validation set. We use a set of inclusion variables which determine whether or not any element of the noisy training set should be included in the training of a network. We compute these inclusion variables by optimizing the performance of the network on the clean validation set via "gradient descent on gradient descent" based learning. The inclusion variables as well as the network trained in such a way form the basis of our methods, which we call Corruption Detection via Gradient Descent (CDGD). This algorithm can be applied to any supervised machine learning task and is not limited to classification problems. We provide a quantitative comparison of these methods on synthetic and real world datasets.
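To make the idea of inclusion variables and "gradient descent on gradient descent" concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. Everything specific in it is an assumption made for illustration: the scalar linear model `y = w * x`, the sigmoid parametrization of the inclusion variables, the single unrolled inner step, the hand-derived chain rule, the learning rates, and the toy data. The helper names `per_sample_grads` and `cdgd_sketch` are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def per_sample_grads(w, data):
    """Gradient of 0.5 * (w * x - y) ** 2 with respect to w, one value per sample."""
    return [(w * x - y) * x for x, y in data]


def cdgd_sketch(train, val, steps=200, inner_lr=0.2, outer_lr=50.0):
    """Learn one inclusion variable per training sample (illustrative only)."""
    n = len(train)
    w = 0.0
    logits = [0.0] * n  # inclusion variable i is sigmoid(logits[i]), so all start at 0.5
    for _ in range(steps):
        s = [sigmoid(a) for a in logits]
        g = per_sample_grads(w, train)
        # Inner step: gradient descent on the inclusion-weighted training loss.
        w_new = w - inner_lr * sum(si * gi for si, gi in zip(s, g)) / n
        # Gradient of the clean validation loss, evaluated at the updated weight.
        g_val = sum(per_sample_grads(w_new, val)) / len(val)
        # Outer step: differentiate the validation loss through the inner step.
        for i in range(n):
            dw_dlogit = -inner_lr * g[i] * s[i] * (1.0 - s[i]) / n
            logits[i] -= outer_lr * g_val * dw_dlogit
        w = w_new
    return w, [sigmoid(a) for a in logits]


# Toy data (assumed): clean targets follow y = 2x, corrupted targets are sign-flipped.
corrupted = {3, 8, 13, 17}
train = []
for i in range(20):
    x = 0.5 + i / 19.0
    y = -2.0 * x if i in corrupted else 2.0 * x
    train.append((x, y))
val = [(x, 2.0 * x) for x in (0.6, 0.9, 1.2, 1.4)]

w, inclusion = cdgd_sketch(train, val)
flagged = {i for i, s in enumerate(inclusion) if s < 0.5}
print("flagged as possibly corrupted:", sorted(flagged))
```

In this toy setting, a training sample whose gradient points against the validation gradient has its inclusion variable pushed down, so low inclusion values mark candidate corruptions. A real network would need automatic differentiation through the inner update rather than the closed-form chain rule used here.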

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1905.05843/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/1905.05843/full.md

## References

23 references — full list in the complete paper: https://tomesphere.com/paper/1905.05843/full.md

---
Source: https://tomesphere.com/paper/1905.05843