Understand Dynamic Regret with Switching Cost for Online Decision Making

Yawei Zhao; Qian Zhao; Xingxing Zhang; En Zhu; Xinwang Liu; Jianping Yin

arXiv:1911.12595 · cs.LG · December 2, 2019 · 1 citation

TL;DR

This paper studies how switching costs influence dynamic regret in online decision making. Its new theoretical analysis shows that switching costs significantly affect dynamic regret in the Online Algorithms (OA) setting but not in the Online Convex Optimization (OCO) setting.

Contribution

The paper introduces a new theoretical analysis framework for the relation between switching cost and dynamic regret, shows that this relation differs between the Online Algorithms (OA) and Online Convex Optimization (OCO) settings, and establishes a lower bound on regret in OCO.

Findings

01. Switching costs impact dynamic regret in OA but not in OCO.

02. In OCO, switching costs do not alter the lower bound of regret.

03. Theoretical analysis differentiates the effect of switching costs across online decision frameworks.

Abstract

As a metric to measure the performance of an online method, dynamic regret with switching cost has drawn much attention for online decision making problems. Although sublinear regret has been established in much previous work, we still have little knowledge about the relation between the dynamic regret and the switching cost. In this paper, we investigate the relation for two classic online settings: Online Algorithms (OA) and Online Convex Optimization (OCO). We provide a new theoretical analysis framework, which yields an interesting observation: the relation between the switching cost and the dynamic regret is different for the settings of OA and OCO. Specifically, the switching cost has a significant impact on the dynamic regret in the setting of OA, but it does not have an impact on the dynamic regret in the setting of OCO. Furthermore, we provide a lower bound of regret…
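To make the metric named in the abstract concrete, here is a minimal Python sketch of one common way to compute dynamic regret with switching cost. It is an illustration under stated assumptions, not the paper's definition: the excerpt does not show the exact formulation, so scalar decisions, an absolute-value switching cost, and a comparator sequence that also pays switching cost are all assumed here. The names `total_cost` and `dynamic_regret` are hypothetical.

```python
from typing import Callable, Sequence

Loss = Callable[[float], float]


def total_cost(losses: Sequence[Loss], decisions: Sequence[float]) -> float:
    """Per-round losses plus switching cost |x_t - x_{t-1}| (assumed form)."""
    hitting = sum(f(x) for f, x in zip(losses, decisions))
    # Switching cost is charged from the second round onward.
    switching = sum(abs(b - a) for a, b in zip(decisions, decisions[1:]))
    return hitting + switching


def dynamic_regret(
    losses: Sequence[Loss],
    decisions: Sequence[float],
    comparators: Sequence[float],
) -> float:
    """Learner's total cost minus that of a time-varying comparator sequence."""
    return total_cost(losses, decisions) - total_cost(losses, comparators)


# Toy instance: quadratic losses whose minimizer drifts over three rounds.
thetas = [0.0, 1.0, 2.0]
losses = [lambda x, th=th: (x - th) ** 2 for th in thetas]

learner = [0.0, 0.0, 1.0]      # lags one round behind the moving minimizer
comparator = [0.0, 1.0, 2.0]   # tracks the minimizer exactly

regret = dynamic_regret(losses, learner, comparator)
```

In this toy instance the learner pays less switching cost than the comparator but more per-round loss, which is the trade-off the abstract's question about the relation between the two quantities is concerned with.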

Taxonomy

Topics: Advanced Bandit Algorithms Research · Optimization and Search Problems · Reinforcement Learning in Robotics