MaxHedge: Maximising a Maximum Online
Stephen Pasteris, Fabio Vitale, Kevin Chan, Shiqiang Wang, Mark Herbster

TL;DR
MaxHedge introduces an online learning framework where a learner selects action subsets under energy constraints to maximize profit, balancing rewards and costs in a changing environment.
Contribution
It presents a novel online learning model with energy constraints and a general, efficient algorithm adaptable to various combinatorial problems.
Findings
Framework effectively balances reward maximization and cost minimization.
Algorithm adapts to changing rewards and costs over time.
Applicable to multiple online combinatorial problems.
Abstract
We introduce a new online learning framework where, at each trial, the learner is required to select a subset of actions from a given known action set. Each action is associated with an energy value, a reward and a cost. The sum of the energies of the actions selected cannot exceed a given energy budget. The goal is to maximise the cumulative profit, where the profit obtained on a single trial is defined as the difference between the maximum reward among the selected actions and the sum of their costs. Action energy values and the budget are known and fixed. All rewards and costs associated with each action change over time and are revealed at each trial only after the learner's selection of actions. Our framework encompasses several online learning problems where the environment changes over time, and the solution trades off between minimising the costs and maximising the maximum…
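The single-trial objective described in the abstract can be illustrated with a minimal sketch. It covers only the budget check and the profit computation, not the MaxHedge algorithm itself. The function names `is_feasible` and `trial_profit`, the example numbers, and the convention that an empty selection earns zero profit are assumptions made for illustration and do not come from the paper.

```python
from typing import Sequence, Set


def is_feasible(selected: Set[int], energy: Sequence[float], budget: float) -> bool:
    """Return True if the total energy of the selected actions is within the budget."""
    return sum(energy[i] for i in selected) <= budget


def trial_profit(selected: Set[int], reward: Sequence[float], cost: Sequence[float]) -> float:
    """Profit on one trial: maximum reward among selected actions minus the sum of their costs.

    Assumption: an empty selection yields a profit of 0.
    """
    if not selected:
        return 0.0
    best_reward = max(reward[i] for i in selected)
    total_cost = sum(cost[i] for i in selected)
    return best_reward - total_cost


# Hypothetical instance with three actions (indices 0, 1, 2).
energy = [2, 1, 3]           # known and fixed
budget = 4                   # known and fixed
reward = [1.0, 0.5, 0.75]    # revealed only after the selection is made
cost = [0.25, 0.125, 0.5]    # revealed only after the selection is made

selection = {0, 1}
feasible = is_feasible(selection, energy, budget)
profit = trial_profit(selection, reward, cost)
print(feasible, profit)
```

In this instance, adding action 1 to a selection that already contains action 0 does not raise the maximum reward but still adds its cost, which is the tension between the maximum-reward term and the summed-cost term that the framework is built around.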
Taxonomy
Topics: Machine Learning and Algorithms
