Invariant Lipschitz Bandits: A Side Observation Approach

Nam Phuong Tran; Long Tran-Thanh

arXiv:2212.07524·cs.LG·August 29, 2023

TL;DR

This paper introduces an algorithm for invariant Lipschitz bandits that exploits symmetries of the arm set and reward function to improve regret bounds, addressing the limited study of symmetry in online optimization.

Contribution

It proposes the UniformMesh-N algorithm that incorporates side observations from group symmetries, providing improved regret bounds for invariant Lipschitz bandits.

Findings

1. Proves an upper regret bound depending on the group size.

2. Establishes a matching lower regret bound up to logarithmic factors.

3. Demonstrates the effectiveness of symmetry exploitation in online bandit problems.

Abstract

Symmetry arises in many optimization and decision-making problems, and has attracted considerable attention from the optimization community: by utilizing the existence of such symmetries, the process of searching for optimal solutions can be improved significantly. Despite its success in (offline) optimization, the utilization of symmetries has not been well examined within the online optimization settings, especially in the bandit literature. As such, in this paper we study the invariant Lipschitz bandit setting, a subclass of the Lipschitz bandits where the reward function and the set of arms are preserved under a group of transformations. We introduce an algorithm named UniformMesh-N, which naturally integrates side observations using group orbits into the UniformMesh algorithm (Kleinberg, 2005), which uniformly discretizes the set of arms. Using…
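To make the idea of side observations from group orbits concrete, here is a minimal Python sketch. It is not the paper's UniformMesh-N algorithm; it only illustrates the general mechanism under several assumptions that do not come from the abstract: the arm set is the interval [-1, 1], the group is the two-element reflection group x -> -x, the mean reward `reward_mean` is a made-up invariant Lipschitz function, the noise is Gaussian, and arm selection uses a standard UCB-style index. All function names are hypothetical.

```python
import math
import random


def make_mesh(k):
    """Uniform mesh of k arms on [-1, 1]; symmetric about 0 by construction."""
    return [-1.0 + 2.0 * i / (k - 1) for i in range(k)]


def orbit_indices(i, k):
    """Mesh indices in the orbit of arm i under the assumed reflection x -> -x."""
    return sorted({i, k - 1 - i})


def reward_mean(x):
    """Hypothetical invariant, 1-Lipschitz mean reward: f(x) == f(-x)."""
    return 1.0 - abs(abs(x) - 0.5)


def run_bandit(n_rounds, k, use_orbits, seed=0):
    """UCB over a uniform mesh, optionally sharing each reward across the orbit.

    Returns (pseudo-regret against the best mesh arm, observation counts
    per arm, actual pull counts per arm).
    """
    rng = random.Random(seed)
    arms = make_mesh(k)
    counts = [0] * k    # observations per arm (own pulls plus side observations)
    sums = [0.0] * k    # sum of observed rewards per arm
    pulls = [0] * k     # actual pulls per arm
    collected = 0.0     # sum of mean rewards of the pulled arms

    for t in range(1, n_rounds + 1):
        def ucb_index(j):
            # Arms with no observations are tried first.
            if counts[j] == 0:
                return math.inf
            return sums[j] / counts[j] + math.sqrt(2.0 * math.log(t) / counts[j])

        i = max(range(k), key=ucb_index)
        reward = reward_mean(arms[i]) + rng.gauss(0.0, 0.1)

        # Invariance means the same sample is a valid observation for every
        # arm in the orbit of the pulled arm, so it is credited to all of them.
        targets = orbit_indices(i, k) if use_orbits else [i]
        for j in targets:
            counts[j] += 1
            sums[j] += reward

        pulls[i] += 1
        collected += reward_mean(arms[i])

    best = max(reward_mean(x) for x in arms)
    return n_rounds * best - collected, counts, pulls


if __name__ == "__main__":
    for flag in (False, True):
        regret, _, _ = run_bandit(5000, 21, use_orbits=flag, seed=0)
        print(f"use_orbits={flag}: pseudo-regret {regret:.1f}")
```

In this toy setting each pull yields up to two observations instead of one, which is the intuition behind a regret bound that improves with the size of the group. The exact construction and analysis in the paper may differ from this sketch.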

Taxonomy

Topics: Advanced Bandit Algorithms Research · Auction Theory and Applications · Misinformation and Its Impacts