Gains and Losses are Fundamentally Different in Regret Minimization: The Sparse Case

Joon Kwon; Vianney Perchet

arXiv:1511.08405·cs.LG·November 30, 2015·1 cites

TL;DR

This paper shows that gains and losses behave differently in non-stochastic regret minimization under a sparsity assumption: the optimal regret bounds are of different orders in the sparsity size s and the dimension d. It also gives an upper bound for the bandit setting when outcomes are losses.

Contribution

It derives optimal regret bounds for the sparse setting, showing that gains and losses lead to bounds of different orders, and gives an upper bound for the bandit setting with losses that is optimal up to a logarithmic factor.

Findings

01

Gains have regret bounds of order √(T log s)

02

Losses have regret bounds of order √(Ts log(d)/d)

03

Bandit setting with losses admits an upper bound of order √(Ts log(d/s))

Abstract

We demonstrate that, in the classical non-stochastic regret minimization problem with $d$ decisions, gains and losses to be respectively maximized or minimized are fundamentally different. Indeed, by considering the additional sparsity assumption (at each stage, at most $s$ decisions incur a nonzero outcome), we derive optimal regret bounds of different orders. Specifically, with gains, we obtain an optimal regret guarantee after $T$ stages of order $\sqrt{T \log s}$, so the classical dependency in the dimension is replaced by the sparsity size. With losses, we provide matching upper and lower bounds of order $\sqrt{T s \log(d) / d}$, which is decreasing in $d$. Finally, we also study the bandit setting, and obtain an upper bound of order $\sqrt{T s \log(d / s)}$ when outcomes are losses. This bound is proven to be optimal up to the logarithmic factor $\sqrt{\log(d / s)}$.
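To make the difference between these orders concrete, the following minimal sketch evaluates each rate for a fixed horizon and sparsity level as the dimension grows. It is only an illustration: constants are ignored, the function names are hypothetical, the values of `T`, `s`, and `d` are arbitrary, and the classical rate $\sqrt{T \log d}$ is included as the usual non-sparse full-information benchmark.

```python
import math

# Orders of the regret bounds stated in the abstract, with constants dropped.
# All function names are illustrative and do not come from the paper.

def gain_rate(T, s):
    """Full information, gains: sqrt(T log s). Assumes s >= 2."""
    return math.sqrt(T * math.log(s))

def loss_rate(T, s, d):
    """Full information, losses: sqrt(T s log(d) / d)."""
    return math.sqrt(T * s * math.log(d) / d)

def bandit_loss_rate(T, s, d):
    """Bandit setting, losses (upper bound): sqrt(T s log(d / s)). Assumes s < d."""
    return math.sqrt(T * s * math.log(d / s))

def classical_rate(T, d):
    """Non-sparse full-information benchmark: sqrt(T log d)."""
    return math.sqrt(T * math.log(d))

T, s = 10_000, 4  # arbitrary horizon and sparsity level
for d in (16, 256, 4096):
    print(
        f"d={d:5d}  classical={classical_rate(T, d):8.1f}  "
        f"gains={gain_rate(T, s):8.1f}  "
        f"losses={loss_rate(T, s, d):8.1f}  "
        f"bandit losses={bandit_loss_rate(T, s, d):8.1f}"
    )
```

With $T$ and $s$ held fixed, the gains rate does not depend on $d$ at all, while the losses rate shrinks as $d$ grows, which is the asymmetry the abstract describes.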

Taxonomy

Topics: Advanced Bandit Algorithms Research · Risk and Portfolio Optimization · Reinforcement Learning in Robotics