Secure Best Arm Identification in the Presence of a Copycat

Asaf Cohen; Onur Günlü

arXiv:2507.18975·cs.LG·July 29, 2025

TL;DR

This paper introduces a secure algorithm for best arm identification in stochastic linear bandits that balances high identification accuracy with privacy against an observer, using coded arms without cryptography.

Contribution

It proposes a novel secure algorithm using coded arms that maintains high error exponents while keeping the best arm secret, without cryptographic methods.

Findings

01

Maintains a high error exponent while playing coded arms

02

Reveals almost no information about the best arm

03

Improves on the $\Omega\left(\frac{T}{d}\right)$ error exponent of naive uniform play while remaining secure

Abstract

Consider the problem of best arm identification with a security constraint. Specifically, assume a setup of stochastic linear bandits with $K$ arms of dimension $d$. In each arm pull, the player receives a reward that is the sum of the dot product of the arm with an unknown parameter vector and independent noise. The player's goal is to identify the best arm after $T$ arm pulls. Moreover, assume a copycat Chloe is observing the arm pulls. The player wishes to keep Chloe ignorant of the best arm. While a minimax-optimal algorithm identifies the best arm with an $\Omega\left(\frac{T}{\log(d)}\right)$ error exponent, it easily reveals its best-arm estimate to an outside observer, as the best arms are played more frequently. A naive secure algorithm that plays all arms equally results in an $\Omega\left(\frac{T}{d}\right)$ exponent. In this paper, we propose a secure algorithm that plays…
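To make the setup concrete, here is a minimal Python sketch of the reward model and of the naive secure baseline described in the abstract (every arm played equally often). It is not the paper's coded-arm algorithm. The Gaussian noise, the least-squares estimator, the round-robin schedule, and all names (`simulate_uniform_play`, `noise_std`, the example `arms` and `theta`) are assumptions made for illustration only.

```python
import numpy as np


def simulate_uniform_play(arms, theta, T, noise_std=1.0, seed=0):
    """Play all K arms equally often and estimate the best arm.

    arms:  (K, d) array, one arm per row.
    theta: (d,) unknown parameter vector (known only to the simulator).
    Returns the estimated best-arm index and the per-arm pull counts,
    which are all an observer of the arm pulls gets to see.
    """
    rng = np.random.default_rng(seed)
    K, _ = arms.shape

    # Round-robin schedule: the pull sequence does not depend on rewards.
    pulls = np.arange(T) % K
    X = arms[pulls]  # (T, d) matrix of played arms

    # Reward = <arm, theta> + independent noise (Gaussian is an assumption).
    rewards = X @ theta + noise_std * rng.standard_normal(T)

    # Least-squares estimate of theta (an illustrative choice of estimator).
    theta_hat, *_ = np.linalg.lstsq(X, rewards, rcond=None)

    best_hat = int(np.argmax(arms @ theta_hat))
    counts = np.bincount(pulls, minlength=K)
    return best_hat, counts


# Hypothetical example: K = d = 4 with standard-basis arms.
arms = np.eye(4)
theta = np.array([0.1, 0.2, 0.9, 0.3])
best_hat, counts = simulate_uniform_play(arms, theta, T=4000)
print("estimated best arm:", best_hat)
print("pull counts seen by the observer:", counts)
```

Because the schedule is fixed in advance, the pull counts are identical across arms and carry no information about the estimate. An adaptive algorithm that plays promising arms more often would instead expose its estimate through those counts, which is the leakage the abstract describes.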
