Secure Best Arm Identification in the Presence of a Copycat
Asaf Cohen, Onur Günlü

TL;DR
This paper introduces a secure algorithm for best arm identification in stochastic linear bandits that balances high identification accuracy with privacy against an observer, using coded arms without cryptography.
Contribution
It proposes a novel secure algorithm using coded arms that maintains high error exponents while keeping the best arm secret, without cryptographic methods.
Findings
Maintains a high error exponent with coded arms
Reveals almost no information about the best arm
Outperforms naive uniform play in security and efficiency
Abstract
Consider the problem of best arm identification with a security constraint. Specifically, assume a setup of stochastic linear bandits with arms of dimension d. In each arm pull, the player receives a reward that is the sum of the dot product of the arm with an unknown parameter vector and independent noise. The player's goal is to identify the best arm after arm pulls. Moreover, assume a copycat Chloe is observing the arm pulls. The player wishes to keep Chloe ignorant of the best arm. While a minimax-optimal algorithm identifies the best arm with an error exponent, it easily reveals its best-arm estimate to an outside observer, as the best arms are played more frequently. A naive secure algorithm that plays all arms equally results in an exponent. In this paper, we propose a secure algorithm that plays…
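The reward model and the naive uniform baseline described in the abstract can be illustrated with a minimal sketch. This is not the paper's coded-arm algorithm. The function names (`pull`, `uniform_play`), the Gaussian noise, the example arms, and the parameter vector are all assumptions made for illustration. For simplicity the sketch averages rewards per arm instead of estimating the parameter vector from the linear structure.

```python
import random

def pull(arm, theta, noise_std, rng):
    """Reward = dot product of the arm with theta, plus independent noise.
    Gaussian noise is an assumption made for this sketch."""
    mean = sum(a * t for a, t in zip(arm, theta))
    return mean + rng.gauss(0.0, noise_std)

def uniform_play(arms, theta, budget, noise_std=1.0, seed=0):
    """Naive secure baseline: play the arms round-robin so every arm is
    pulled (almost) equally often, then return the arm with the highest
    empirical mean reward. Requires budget >= len(arms)."""
    rng = random.Random(seed)
    counts = [0] * len(arms)
    sums = [0.0] * len(arms)
    for t in range(budget):
        i = t % len(arms)  # round-robin: the schedule ignores past rewards
        sums[i] += pull(arms[i], theta, noise_std, rng)
        counts[i] += 1
    means = [s / c for s, c in zip(sums, counts)]
    best = max(range(len(arms)), key=lambda i: means[i])
    return best, counts

# Hypothetical instance: three arms in dimension 2.
arms = [(1.0, 0.0), (0.0, 1.0), (0.6, 0.6)]
theta = (1.0, 0.2)  # unknown to the player; used here only to simulate rewards
best, counts = uniform_play(arms, theta, budget=3000)
print(best, counts)
```

Because the pull schedule does not depend on the observed rewards, an observer who sees only which arms are pulled learns nothing about the best-arm estimate from the pull counts. The price is that the budget is spread evenly over all arms, including clearly suboptimal ones.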
