Robust classification via MOM minimization

Guillaume Lecué; Matthieu Lerasle; Timothée Mathieu

arXiv:1808.03106·math.ST·August 10, 2018·Mach. Learn.

TL;DR

This paper introduces MOM minimizers, a robust classification method based on median-of-means estimators, which are less sensitive to outliers and dataset corruption, and demonstrates their theoretical and empirical effectiveness.

Contribution

It extends Vapnik's ERM framework by incorporating MOM estimators, providing a robust alternative with proven convergence properties and practical advantages in computation.

Findings

01

MOM minimizers achieve Vapnik's slow convergence rates under weak assumptions.

02

The proposed algorithms are robust to dataset corruption and outliers.

03

Empirical results show improved performance and efficiency on simulated and real datasets.

Abstract

We present an extension of Vapnik's classical empirical risk minimizer (ERM) in which the empirical risk is replaced by a median-of-means (MOM) estimator; the new estimators are called MOM minimizers. While ERM is sensitive to corruption of the dataset for many classical loss functions used in classification, we show that MOM minimizers behave well in theory, in the sense that they achieve Vapnik's (slow) rates of convergence under weak assumptions: the data are only required to have a finite second moment, and some outliers may also have corrupted the dataset. We propose algorithms inspired by MOM minimizers. These algorithms can be analyzed using arguments quite similar to those used for stochastic block gradient descent. As a proof of concept, we show how to modify a proof of consistency for a descent algorithm to prove consistency of its MOM version. As MOM algorithms perform a smart…
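To make the idea of replacing the empirical risk by a median-of-means estimator concrete, here is a minimal sketch in Python. It is not the algorithm from the paper (which the abstract relates to stochastic block gradient descent). It only compares the plain average and a MOM estimate of the loss over two candidate classifiers. The hinge loss, the one-dimensional linear score `w * x`, the toy data, the interleaved block partition, and the choice of 11 blocks are all assumptions made for illustration.

```python
import statistics


def median_of_means(values, n_blocks):
    """Median of the block means of `values`.

    Blocks are interleaved (block k holds indices k, k + n_blocks, ...),
    which assumes the sample is given in arbitrary order.
    """
    if not 1 <= n_blocks <= len(values):
        raise ValueError("n_blocks must be between 1 and len(values)")
    blocks = [values[k::n_blocks] for k in range(n_blocks)]
    return statistics.median(statistics.fmean(b) for b in blocks)


def hinge_losses(w, xs, ys):
    """Hinge loss max(0, 1 - y * w * x) of the linear score w * x."""
    return [max(0.0, 1.0 - y * w * x) for x, y in zip(xs, ys)]


# Hypothetical toy data: 97 clean points with x == y (labels alternate
# between +1 and -1), followed by 3 outliers with a huge feature value
# and the "wrong" label.
ys = [1 if i % 2 == 0 else -1 for i in range(97)] + [-1, -1, -1]
xs = [float(y) for y in ys[:97]] + [1000.0, 1000.0, 1000.0]

candidates = [-1.0, 1.0]  # illustrative finite class of classifiers
n_blocks = 11             # illustrative; should exceed twice the outlier count


def erm_risk(w):
    # Classical empirical risk: plain average of the losses.
    return statistics.fmean(hinge_losses(w, xs, ys))


def mom_risk(w):
    # MOM estimate of the risk: median of the block-wise average losses.
    return median_of_means(hinge_losses(w, xs, ys), n_blocks)


erm_choice = min(candidates, key=erm_risk)
mom_choice = min(candidates, key=mom_risk)

print("ERM picks w =", erm_choice)
print("MOM picks w =", mom_choice)
```

On this toy sample the three outliers dominate the plain average, so minimizing it selects `w = -1.0`, which misclassifies every clean point. The outliers can contaminate at most three of the eleven blocks, so the median of the block means ignores them and selects `w = 1.0`.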
