Better scalability under potentially heavy-tailed feedback

Matthew J. Holland

arXiv:2012.07346 · stat.ML · December 15, 2020

TL;DR

This paper proposes scalable alternatives to robust gradient descent for learning when losses and gradients may be heavy-tailed. Rather than robustly aggregating gradients at every step, it runs cheap stochastic sub-processes in parallel and robustly selects (or newly constructs) a strong candidate from their outputs.

Contribution

The paper replaces per-step robust gradient aggregation, which is costly and leads to sub-optimal dimension dependence in risk bounds, with a robust candidate selection step applied to the outputs of cheap stochastic sub-processes that can be run in parallel.
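For contrast, the kind of per-step aggregation that this approach avoids can be sketched as below. This is a minimal illustration only, assuming a coordinate-wise median-of-means aggregator as a stand-in for robust aggregation; the function name `mom_gradient` and its parameters are hypothetical and are not taken from the paper or its repository.

```python
import statistics


def mom_gradient(per_sample_grads, num_blocks):
    """Coordinate-wise median-of-means over a list of gradient vectors.

    per_sample_grads: list of equal-length lists of floats (one per sample).
    num_blocks: number of disjoint blocks to average before taking medians.
    Samples beyond num_blocks * block_size are ignored in this sketch.
    """
    n = len(per_sample_grads)
    block_size = n // num_blocks
    if block_size == 0:
        raise ValueError("need at least one sample per block")
    dim = len(per_sample_grads[0])

    # Average the gradients inside each disjoint block.
    block_means = []
    for b in range(num_blocks):
        block = per_sample_grads[b * block_size:(b + 1) * block_size]
        block_means.append(
            [sum(g[j] for g in block) / len(block) for j in range(dim)]
        )

    # Take the median of the block means, one coordinate at a time.
    return [statistics.median(m[j] for m in block_means) for j in range(dim)]


if __name__ == "__main__":
    # Nine well-behaved gradients and one wild outlier.
    grads = [[1.0, 2.0]] * 9 + [[1000.0, -1000.0]]
    print(mom_gradient(grads, num_blocks=5))
```

A robust gradient descent loop would call a routine like this at every update, touching every sample and every coordinate each time, which is where the cost accumulates.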

Findings

01. Method scales better to large problems.

02. Empirical robustness to heavy-tailed noise.

03. Applicable to various benchmark datasets.

Abstract

We study scalable alternatives to robust gradient descent (RGD) techniques that can be used when the losses and/or gradients can be heavy-tailed, though this will be unknown to the learner. The core technique is simple: instead of trying to robustly aggregate gradients at each step, which is costly and leads to sub-optimal dimension dependence in risk bounds, we instead focus computational effort on robustly choosing (or newly constructing) a strong candidate based on a collection of cheap stochastic sub-processes which can be run in parallel. The exact selection process depends on the convexity of the underlying objective, but in all cases, our selection technique amounts to a robust form of boosting the confidence of weak learners. In addition to formal guarantees, we also provide empirical analysis of robustness to perturbations to experimental conditions, under both sub-Gaussian and…
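The core idea in the abstract, running cheap sub-processes and then spending effort on a robust choice among their outputs, can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the paper's procedure: it assumes a scalar linear model with squared loss, plain single-pass SGD as the sub-process, disjoint data partitions, and a median-of-means validation loss as the selection rule. All names (`run_sgd`, `mom_loss`, `select_candidate`, `make_data`) are hypothetical.

```python
import random
import statistics


def run_sgd(samples, step_size=0.01):
    """One cheap sub-process: a single pass of plain SGD for y ~ w * x."""
    w = 0.0
    for x, y in samples:
        # Gradient of 0.5 * (w * x - y) ** 2 with respect to w.
        w -= step_size * (w * x - y) * x
    return w


def mom_loss(w, samples, num_blocks=10):
    """Median-of-means estimate of the squared loss of w on held-out data."""
    size = len(samples) // num_blocks
    if size == 0:
        raise ValueError("need at least one sample per block")
    means = []
    for b in range(num_blocks):
        block = samples[b * size:(b + 1) * size]
        means.append(sum((w * x - y) ** 2 for x, y in block) / len(block))
    return statistics.median(means)


def select_candidate(train, valid, num_procs=8, step_size=0.01, num_blocks=10):
    """Run SGD on disjoint partitions, then keep the candidate whose
    robustly estimated validation loss is smallest."""
    size = len(train) // num_procs
    if size == 0:
        raise ValueError("need at least one sample per sub-process")
    # The sub-processes are independent, so they could run in parallel;
    # they run sequentially here to keep the sketch short.
    candidates = [
        run_sgd(train[i * size:(i + 1) * size], step_size)
        for i in range(num_procs)
    ]
    best = min(candidates, key=lambda w: mom_loss(w, valid, num_blocks))
    return best, candidates


def make_data(n, w_true, rng):
    """Synthetic data with symmetric, Pareto-tailed (heavy-tailed) noise."""
    data = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        noise = rng.choice([-1.0, 1.0]) * (rng.paretovariate(2.1) - 1.0)
        data.append((x, w_true * x + noise))
    return data


if __name__ == "__main__":
    rng = random.Random(0)
    train = make_data(4000, 2.0, rng)
    valid = make_data(1000, 2.0, rng)
    best, candidates = select_candidate(train, valid)
    print("candidates:", candidates)
    print("selected:", best)
```

In this sketch the robust computation happens once, over a handful of scalar validation scores, rather than at every gradient step. The abstract notes that the exact selection process in the paper depends on the convexity of the objective, so the validation-based rule above should be read as one illustrative option.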

Code & Models

Repositories

feedbackward/sgd-roboost (PyTorch, official)

Taxonomy

Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Stochastic Gradient Optimization Techniques