Online Scalarization in Vector-Valued Games
Ehsan Asadollahi, Calvin Hawkins, Matthew Hale

TL;DR
This paper introduces an online scalarization approach for vector-valued games, enabling adaptive decision-making that improves convergence to preferred equilibria through a bi-level learning framework.
Contribution
It proposes a novel bi-level learning framework with adaptive scalarization, along with bandit online mirror descent algorithms and finite-time regret guarantees.
Findings
Convergence to the preferred equilibrium increased from 50% to 80% with the proposed method.
Finite-time sublinear regret bounds are established for the algorithms.
Experiments demonstrate improved performance over non-adaptive scalarization.
Abstract
We study repeated multi-player vector-valued games in which a player observes a payoff vector each round and evaluates outcomes through linear scalarizations of those vectors. Unlike most prior work, we treat the choice of scalarization as an online decision variable rather than a fixed modeling decision. We propose a bi-level learning framework in which an outer learner chooses a scalarization from a finite candidate class on a slow timescale, while a faster inner bandit no-regret learner selects actions using the scalar feedback induced by the chosen scalarization. Performance of this approach is defined with respect to a certain true weight vector, and the deployed scalarizations act as control signals that shape the induced payoff trajectory. We provide implementable algorithms based on bandit online mirror descent with stabilized importance weighting, and we derive…
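The bi-level structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm. It assumes entropic mirror descent (exponential weights) at both levels, uses an implicit-exploration term `gamma` as one possible form of stabilized importance weighting, resets the inner learner at the start of each epoch, and scores each epoch by the average payoff under the true weight vector. The names `run_bilevel`, `payoff_fn`, `candidates`, and `true_w` are hypothetical, and `payoff_fn` stands in for the environment, including the other players.

```python
import math
import random


def exp_weights(cum_losses, eta):
    """Entropic mirror descent iterate on the simplex (exponential weights)."""
    base = min(cum_losses)  # shift so every exponent is <= 0 (no overflow)
    w = [math.exp(-eta * (loss - base)) for loss in cum_losses]
    total = sum(w)
    return [x / total for x in w]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def run_bilevel(payoff_fn, n_actions, candidates, true_w,
                n_epochs, epoch_len,
                eta_in=0.1, eta_out=0.1, gamma=0.05, seed=0):
    """Sketch of a two-timescale learner (all modeling choices are assumptions).

    payoff_fn(action, rng) -> payoff vector with entries in [0, 1].
    candidates             -> finite list of scalarization weight vectors.
    true_w                 -> weight vector used only to score each epoch.
    """
    rng = random.Random(seed)
    outer_loss = [0.0] * len(candidates)
    history = []

    for _ in range(n_epochs):
        # Outer (slow) learner: sample one candidate scalarization per epoch.
        q = exp_weights(outer_loss, eta_out)
        k = rng.choices(range(len(candidates)), weights=q, k=1)[0]
        w = candidates[k]

        # Inner (fast) bandit learner: restarted each epoch (assumption).
        inner_loss = [0.0] * n_actions
        true_total = 0.0
        for _ in range(epoch_len):
            p = exp_weights(inner_loss, eta_in)
            a = rng.choices(range(n_actions), weights=p, k=1)[0]
            vec = payoff_fn(a, rng)

            # Scalar bandit feedback induced by the deployed scalarization.
            loss = 1.0 - dot(w, vec)
            # Importance-weighted estimate; gamma bounds it by 1 / gamma.
            inner_loss[a] += loss / (p[a] + gamma)

            true_total += dot(true_w, vec)

        # Outer feedback: epoch-average payoff under the true weights.
        avg_true = true_total / epoch_len
        outer_loss[k] += (1.0 - avg_true) / (q[k] + gamma)
        history.append((k, avg_true))

    return exp_weights(outer_loss, eta_out), history
```

A call such as `run_bilevel(lambda a, rng: table[a], 3, [[1.0, 0.0], [0.0, 1.0]], [0.0, 1.0], n_epochs=20, epoch_len=30)` with a hypothetical payoff table `table` returns the outer learner's final distribution over candidate scalarizations and a per-epoch record of which candidate was deployed and the true-weighted average payoff it produced. Step sizes, epoch length, and the reset rule are placeholders and would need to follow the paper's analysis for any regret guarantee to apply.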
