Improved scalability under heavy tails, without strong convexity

Matthew J. Holland

arXiv:2006.01364 · stat.ML · December 16, 2020 · 1 citation

TL;DR

This paper studies a simple, scalable strategy for learning when both losses and gradients may be heavy-tailed: a robust validation sub-routine boosts the confidence of inexpensive gradient-based sub-processes, improving dimension dependence in both risk bounds and cost without relying on strong convexity.

Contribution

It presents a simple robust validation sub-routine that enhances gradient-based methods for heavy-tailed data, avoiding expensive robustification steps and strong convexity assumptions.

Findings

1. Improved dimension dependence in both risk bounds and computational cost, compared with recent robust gradient descent methods.

2. Under heavy-tailed losses, the proposed procedure cannot simply be replaced with naive cross-validation.

3. Provides transparent guarantees for heavy-tailed data scenarios.

Abstract

Real-world data is laden with outlying values. The challenge for machine learning is that the learner typically has no prior knowledge of whether the feedback it receives (losses, gradients, etc.) will be heavy-tailed or not. In this work, we study a simple algorithmic strategy that can be leveraged when both losses and gradients can be heavy-tailed. The core technique introduces a simple robust validation sub-routine, which is used to boost the confidence of inexpensive gradient-based sub-processes. Compared with recent robust gradient descent methods from the literature, dimension dependence (both risk bounds and cost) is substantially improved, without relying upon strong convexity or expensive per-step robustification. Empirically, we also show that under heavy-tailed losses, the proposed procedure cannot simply be replaced with naive cross-validation. Taken together, we have a…
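
The "validate, then pick" idea described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's algorithm: the robust validation step here uses a median-of-means estimate of the validation loss (a stand-in choice, since the abstract does not specify the estimator), the cheap sub-process is a single pass of averaged SGD on squared error, and the names `median_of_means`, `sgd_pass`, `sq_loss`, and `robust_select` are hypothetical.

```python
import random
import statistics


def median_of_means(values, n_blocks):
    """Robust mean estimate: average within disjoint blocks, then take the median."""
    values = list(values)
    blocks = [values[i::n_blocks] for i in range(n_blocks)]
    return statistics.median(statistics.fmean(b) for b in blocks if b)


def sgd_pass(data, dim, step=0.01):
    """One cheap averaged-SGD pass on squared error (assumed sub-process)."""
    w = [0.0] * dim
    w_avg = [0.0] * dim
    for t, (x, y) in enumerate(data, start=1):
        resid = sum(wi * xi for wi, xi in zip(w, x)) - y
        w = [wi - step * resid * xi for wi, xi in zip(w, x)]
        # Running average of the iterates.
        w_avg = [ai + (wi - ai) / t for ai, wi in zip(w_avg, w)]
    return w_avg


def sq_loss(w, x, y):
    """Squared error of the linear predictor w on one example."""
    return (sum(wi * xi for wi, xi in zip(w, x)) - y) ** 2


def robust_select(data, dim, n_candidates=5, n_blocks=9):
    """Train candidates on disjoint chunks, then keep the one whose
    robustly estimated validation loss is smallest."""
    half = len(data) // 2
    train, val = data[:half], data[half:]
    chunks = [train[i::n_candidates] for i in range(n_candidates)]
    candidates = [sgd_pass(chunk, dim) for chunk in chunks]
    scores = [
        median_of_means((sq_loss(w, x, y) for x, y in val), n_blocks)
        for w in candidates
    ]
    best = min(range(n_candidates), key=scores.__getitem__)
    return candidates[best], scores
```

Replacing `median_of_means` with a plain average of the validation losses would give a naive validation baseline for comparison; on heavy-tailed losses a few extreme values can dominate that average, which is the failure mode a robust estimate is meant to avoid.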

Taxonomy

Topics: Machine Learning and Algorithms · Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques