Attribute noise robust binary classification

Aditya Petety; Sandhya Tripathi; N Hemachandra

arXiv:1911.07875 · cs.LG · November 20, 2019 · 1 citation

Open Access

TL;DR

This paper investigates how robust different loss functions are for binary classification with noisy binary features. It shows that squared loss is robust under the Sy-De attribute noise model and reports empirical results that support this at low to moderate noise rates.

Contribution

The paper gives a theoretical analysis of loss-function robustness under the Sy-De and Asy-In attribute noise models, contrasting 0-1 loss with its squared-loss surrogate.

Findings

01

Squared loss is robust under the Sy-De attribute noise model, whereas 0-1 loss need not be.

02

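0-1 loss is robust under the Asy-In attribute noise model for any distribution over a 2-dimensional feature space, but it is computationally intractable; squared loss need not be Asy-In robust.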

03

Empirical results support the Sy-De robustness of squared loss at low to moderate noise rates.

Abstract

We consider the problem of learning linear classifiers when both features and labels are binary. In addition, the features are noisy, i.e., they could be flipped with an unknown probability. In Sy-De attribute noise model, where all features could be noisy together with same probability, we show that $0$-$1$ loss ($l_{0-1}$) need not be robust but a popular surrogate, squared loss ($l_{sq}$) is. In Asy-In attribute noise model, we prove that $l_{0-1}$ is robust for any distribution over 2 dimensional feature space. However, due to computational intractability of $l_{0-1}$, we resort to $l_{sq}$ and observe that it need not be Asy-In noise robust. Our empirical results support Sy-De robustness of squared loss for low to moderate noise rates.
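
The Sy-De setting described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' code. It assumes features and labels are encoded as $\pm 1$, reads "all features could be noisy together with same probability" as one coin flip per example that negates the whole feature vector, and uses an ordinary least-squares fit as the squared-loss classifier. The helper names (`sy_de_flip`, `fit_squared_loss`, `predict`), the synthetic data, and the weight vector are all hypothetical.

```python
import numpy as np

def sy_de_flip(X, p, rng):
    """Assumed Sy-De noise: with probability p, flip ALL features of a row together."""
    flip_row = rng.random(X.shape[0]) < p       # one coin per example
    signs = np.where(flip_row, -1.0, 1.0)       # -1 negates the whole row
    return X * signs[:, None]

def fit_squared_loss(X, y):
    """Linear classifier minimising the mean squared loss (w.x + b - y)^2."""
    A = np.hstack([X, np.ones((X.shape[0], 1))])  # append a bias column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef[:-1], coef[-1]

def predict(X, w, b):
    """Sign of the linear score, with ties mapped to +1."""
    return np.where(X @ w + b >= 0, 1.0, -1.0)

rng = np.random.default_rng(0)

# Hypothetical synthetic data: 5 binary (+1/-1) features, labels from a linear rule.
X_clean = rng.choice([-1.0, 1.0], size=(2000, 5))
w_true = np.array([2.0, -1.0, 1.0, 0.5, -1.5])   # illustrative weights only
y = np.where(X_clean @ w_true >= 0, 1.0, -1.0)

# Train once on clean features and once on Sy-De-corrupted features.
X_noisy = sy_de_flip(X_clean, p=0.2, rng=rng)
w_clean, b_clean = fit_squared_loss(X_clean, y)
w_noisy, b_noisy = fit_squared_loss(X_noisy, y)

# Fraction of clean points on which the two classifiers agree.
agreement = np.mean(
    predict(X_clean, w_clean, b_clean) == predict(X_clean, w_noisy, b_noisy)
)
print(f"prediction agreement on clean data: {agreement:.3f}")
```

Comparing the classifier trained on noisy features against the one trained on clean features, both evaluated on clean data, is one simple way to probe robustness empirically. Varying `p` shows how the agreement changes with the noise rate.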

Taxonomy

Topics: Machine Learning and Data Classification · Machine Learning and Algorithms · Imbalanced Data Classification Techniques