Fixed-Budget Differentially Private Best Arm Identification
Zhirui Chen, P. N. Karthik, Yeow Meng Chee, and Vincent Y. F. Tan

TL;DR
This paper introduces a differentially private algorithm for best arm identification in linear bandits within a fixed-budget setting, providing matching upper and lower bounds on error probability that account for privacy constraints.
Contribution
It proposes the DP-BAI policy based on maximum absolute determinants and derives exponential decay bounds on error probability under differential privacy in fixed-budget BAI.
Findings
Error probability bounds decay exponentially with sample size T.
Bounds depend on arm gaps, privacy parameter ε, and problem complexity.
Theoretical results fill a gap in fixed-budget BAI under differential privacy.
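The dependence on $T$ and ε in the findings above can be seen in a small simulation. The sketch below is not the paper's DP-BAI policy: it assumes a simple uniform-allocation baseline with Bernoulli rewards, where each arm's empirical mean is privatised with the standard Laplace mechanism. The names `private_mean` and `error_rate` are hypothetical and chosen for illustration only.

```python
import random


def private_mean(rewards, epsilon, rng):
    """Laplace-mechanism estimate of the mean of rewards in [0, 1].

    Assumption: this is the textbook Laplace mechanism, used here as a
    stand-in; it is not taken from the paper.
    """
    n = len(rewards)
    clipped = [min(1.0, max(0.0, r)) for r in rewards]
    # Changing one reward moves the mean by at most 1/n,
    # so Laplace noise with scale 1/(n * epsilon) gives epsilon-DP.
    scale = 1.0 / (n * epsilon)
    # The difference of two i.i.d. exponentials is Laplace(0, scale).
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return sum(clipped) / n + noise


def error_rate(means, budget, epsilon, trials, seed=0):
    """Monte Carlo estimate of the probability of picking a wrong arm.

    Uniform allocation (hypothetical baseline): each arm is pulled
    budget // len(means) times, then the arm with the largest
    privatised mean is returned.
    """
    rng = random.Random(seed)
    best = max(range(len(means)), key=lambda i: means[i])
    pulls = budget // len(means)
    errors = 0
    for _ in range(trials):
        estimates = []
        for mu in means:
            rewards = [1.0 if rng.random() < mu else 0.0 for _ in range(pulls)]
            estimates.append(private_mean(rewards, epsilon, rng))
        guess = max(range(len(means)), key=lambda i: estimates[i])
        errors += guess != best
    return errors / trials


if __name__ == "__main__":
    for budget in (4, 20, 100):
        for epsilon in (0.1, 1.0):
            rate = error_rate([0.7, 0.3], budget, epsilon, trials=2000)
            print(f"T={budget:4d}  eps={epsilon:4.1f}  error~{rate:.3f}")
```

Because the allocation is fixed in advance and each reward enters exactly one noisy mean, the returned arm is a post-processing of ε-DP statistics. The estimated error rate should shrink as the budget grows and grow as ε shrinks, which is the qualitative behaviour the findings describe.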
Abstract
We study best arm identification (BAI) in linear bandits in the fixed-budget regime under differential privacy constraints, when the arm rewards are supported on the unit interval. Given a finite budget $T$ and a privacy parameter $\epsilon$, the goal is to minimise the error probability in finding the arm with the largest mean after $T$ sampling rounds, subject to the constraint that the policy of the decision maker satisfies a certain $\epsilon$-differential privacy ($\epsilon$-DP) constraint. We construct a policy satisfying the $\epsilon$-DP constraint (called DP-BAI) by proposing the principle of maximum absolute determinants, and derive an upper bound on its error probability. Furthermore, we derive a minimax lower bound on the error probability, and demonstrate that the lower and the upper bounds decay exponentially in $T$, with exponents in the…
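For reference, the privacy constraint mentioned in the abstract can be written in its standard textbook form. This is a sketch under assumptions: the symbols $\pi$ (the policy), $x, x'$ (reward tables) and $S$ (a set of outputs) are introduced here for illustration, and the paper's exact notion of neighbouring inputs may differ.

$$
\Pr\big[\pi(x) \in S\big] \;\le\; e^{\epsilon}\, \Pr\big[\pi(x') \in S\big]
\quad \text{for all measurable } S \text{ and all } x, x' \text{ that differ in a single reward.}
$$

A smaller $\epsilon$ forces the output distributions on neighbouring inputs to be closer, which is a stronger privacy guarantee. As $\epsilon \to \infty$ the constraint becomes vacuous.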
Peer Reviews
Decision: ICLR 2024 poster
To this end, the paper proposes a policy satisfying $\epsilon$-DP, thus providing an upper bound of the decaying speed of the error probability. The paper also provides an almost-matching lower bound. Empirical evaluation is also provided to show the effectiveness of the algorithm.
Although this is nice work, I still suggest that the paper provide more discussion of the connections between this problem and (1) BAI in the fixed-confidence regime and (2) more generally, MAB under DP. I understand that, superficially, these are different problems, but it is not clear (to me) whether deeper connections exist. For example, there might be a simple adaptation of previous algorithms to suit this setting.
This paper introduces the first algorithm to solve the BAI problem within a fixed-budget constraint and under DP guarantees. The paper also offers an extensive theoretical analysis, establishing an upper bound on the error probability for the new algorithm, which is adaptive to the complexity of the problem measured by $H_{BAI}, H_{pri}$. Furthermore, it provides a matching lower bound, demonstrating that the algorithm attains optimal performance in this specific setting.
The paper lacks a detailed comparison with existing non-private fixed-budget BAI works. For example, a natural question is: Does the error probability of DP-BAI converge to that of the state-of-the-art non-private counterpart when $\epsilon \to \infty$? Such an analysis would be valuable in understanding the trade-offs between privacy and performance. Moreover, I personally believe that the writing of this paper, especially in the algorithm description section, could be improved. Currently the
I think the authors provide fundamental analysis of the problem in terms of the upper and lower bound. The main strength of the contribution is demonstrated by proving that the algorithm matches instance optimal bounds. Further, the algorithm idea is new itself and clearly outperforms existing benchmarks (and straightforward adaptations thereof).
I think the paper could be written more intuitively, given that it has a lot of parameters. For example, while the algorithm is stated clearly, I am unsure why it works intuitively. What makes the apparent dimension go down for the first few rounds? How does decreasing the span basis vectors of the arm space lead to convergence to the optimal arm?
Taxonomy
Topics: Advanced Bandit Algorithms Research · Age of Information Optimization · Privacy-Preserving Technologies in Data
