Online Convex Optimization with Unbounded Memory
Raunak Kumar, Sarah Dean, and Robert Kleinberg

TL;DR
This paper introduces a framework for online convex optimization in which the loss can depend on the entire history of past decisions, with tight regret bounds and applications to several online learning problems.
Contribution
The paper generalizes OCO to allow unbounded memory, introduces the $p$-effective memory capacity $H_p$, and establishes tight regret bounds, including new lower bounds for the finite-memory case.
Findings
Proves an $O(\sqrt{H_p T})$ upper bound on regret.
Establishes a matching lower bound for the regret.
Applies the framework to online linear control and performative prediction.
Abstract
Online convex optimization (OCO) is a widely used framework in online learning. In each round, the learner chooses a decision in a convex set and an adversary chooses a convex loss function, and then the learner suffers the loss associated with their current decision. However, in many applications the learner's loss depends not only on the current decision but on the entire history of decisions until that point. The OCO framework and its existing generalizations do not capture this, and they can only be applied to many settings of interest after a long series of approximation arguments. They also leave open the question of whether the dependence on memory is tight because there are no non-trivial lower bounds. In this work we introduce a generalization of the OCO framework, "Online Convex Optimization with Unbounded Memory", that captures long-term dependence on past decisions. We…
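The round-by-round protocol described in the abstract can be sketched in a few lines. The example below is a minimal illustration under stated assumptions, not the paper's algorithm: the decision set is the interval $[-1, 1]$, the history is a discounted sum $h_t = \rho h_{t-1} + x_t$ of all past decisions, the loss is the quadratic $(h_t - y_t)^2$, and the learner runs plain projected online gradient descent using the gradient with respect to the current decision only. The names `project`, `run_ogd_with_memory`, `rho`, and `eta` are hypothetical and chosen for illustration.

```python
import math


def project(x, lo=-1.0, hi=1.0):
    """Euclidean projection onto the interval [lo, hi], a convex set."""
    return max(lo, min(hi, x))


def run_ogd_with_memory(targets, rho=0.5, eta=None):
    """Projected online gradient descent on a loss that depends on history.

    Assumed toy model (not taken from the paper):
      history  h_t = rho * h_{t-1} + x_t   (depends on every past decision)
      loss     f_t(h_t) = (h_t - y_t) ** 2 (y_t is revealed after x_t)
    Returns the list of decisions and the total loss suffered.
    """
    T = len(targets)
    if eta is None:
        eta = 1.0 / math.sqrt(T)  # standard step-size scaling for OGD

    x, h, total_loss = 0.0, 0.0, 0.0
    decisions = []
    for y in targets:
        decisions.append(x)           # learner commits to x_t
        h = rho * h + x               # history absorbs the new decision
        total_loss += (h - y) ** 2    # adversary's loss is revealed
        grad = 2.0 * (h - y)          # d f_t / d x_t, past decisions held fixed
        x = project(x - eta * grad)   # gradient step, then projection
    return decisions, total_loss


if __name__ == "__main__":
    decisions, total_loss = run_ogd_with_memory([0.5] * 200)
    print(f"last decision: {decisions[-1]:.3f}, total loss: {total_loss:.3f}")
```

Because each decision keeps influencing later losses through `h`, comparing against the best fixed decision in hindsight requires evaluating that decision as if it had been played in every round, which is the policy-regret viewpoint that this kind of memory-dependent setting calls for.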
Taxonomy
Topics: Advanced Bandit Algorithms Research · Optimization and Search Problems · Advanced Wireless Network Optimization
