A Model-Based Method for Minimizing CVaR and Beyond

Si Yi Meng; Robert M. Gower

arXiv:2305.17498 · math.OC · May 30, 2023 · 1 citation

TL;DR

This paper introduces a stochastic prox-linear method tailored for minimizing CVaR in machine learning, offering better structure exploitation, adaptive scaling, and wider step size choices compared to traditional stochastic subgradient methods.

Contribution

The paper presents SPL+, a stochastic prox-linear algorithm for CVaR minimization that improves on the stochastic subgradient method (SGM) by exploiting the structure of the objective and permitting a wider range of step sizes.

Findings

01. SPL+ outperforms SGM in convergence speed.

02. SPL+ adapts to the scaling of the loss function, which makes tuning easier.

03. Experiments support the theoretical finding that SPL+ admits a wider selection of step sizes than SGM.

Abstract

We develop a variant of the stochastic prox-linear method for minimizing the Conditional Value-at-Risk (CVaR) objective. CVaR is a risk measure focused on minimizing worst-case performance, defined as the average of the top quantile of the losses. In machine learning, such a risk measure is useful to train more robust models. Although the stochastic subgradient method (SGM) is a natural choice for minimizing the CVaR objective, we show that our stochastic prox-linear (SPL+) algorithm can better exploit the structure of the objective, while still providing a convenient closed form update. Our SPL+ method also adapts to the scaling of the loss function, which allows for easier tuning. We then specialize a general convergence theorem for SPL+ to our setting, and show that it allows for a wider selection of step sizes compared to SGM. We support this theoretical finding experimentally.
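To make the objective concrete, here is a minimal Python sketch of the quantities the abstract refers to. It is not the SPL+ update from the paper. It shows only (a) an empirical CVaR computed as the mean of the worst (1 - beta) fraction of losses, and (b) the SGM baseline applied to the standard Rockafellar-Uryasev reformulation, alpha + E[max(loss - alpha, 0)] / (1 - beta). Using that reformulation here is an assumption, as are the linear model with squared loss, the function names, and all default parameter values, which are chosen purely for illustration.

```python
import numpy as np


def empirical_cvar(losses, beta):
    """Mean of the worst (1 - beta) fraction of the losses (tail-mean estimate)."""
    losses = np.sort(np.asarray(losses, dtype=float))
    k = max(1, int(np.ceil((1.0 - beta) * losses.size)))  # size of the tail
    return losses[-k:].mean()


def ru_objective(alpha, losses, beta):
    """Rockafellar-Uryasev objective: alpha + mean(max(loss - alpha, 0)) / (1 - beta)."""
    losses = np.asarray(losses, dtype=float)
    return alpha + np.maximum(losses - alpha, 0.0).mean() / (1.0 - beta)


def sgm_cvar(X, y, beta=0.9, step=0.005, epochs=20, seed=0):
    """Stochastic subgradient baseline on the Rockafellar-Uryasev objective.

    Assumed (hypothetical) setup: linear model, loss_i(w) = 0.5 * (x_i @ w - y_i)**2.
    Jointly updates the weights w and the auxiliary threshold alpha.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w, alpha = np.zeros(d), 0.0
    scale = 1.0 / (1.0 - beta)
    for _ in range(epochs):
        for i in rng.permutation(n):
            residual = X[i] @ w - y[i]
            loss = 0.5 * residual**2
            if loss > alpha:
                # Hinge term is active: subgradient flows through the loss.
                g_w = scale * residual * X[i]
                g_alpha = 1.0 - scale
            else:
                # Hinge term is inactive: only the linear alpha term remains.
                g_w = np.zeros(d)
                g_alpha = 1.0
            w -= step * g_w
            alpha -= step * g_alpha
    return w, alpha
```

A possible usage: call `sgm_cvar(X, y, beta=0.9)` on a feature matrix and target vector, then evaluate `empirical_cvar(0.5 * (X @ w - y) ** 2, 0.9)` to inspect the tail loss of the fitted model. Note that the subgradient is scaled by 1 / (1 - beta), so the usable step size in this baseline depends on both beta and the scale of the loss; this is the tuning sensitivity that the abstract says SPL+ is designed to reduce.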

Videos

A Model-Based Method for Minimizing CVaR and Beyond · SlidesLive

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Explainable Artificial Intelligence (XAI) · Statistical Methods and Inference