Local Differential Privacy for Sequential Decision Making in a Changing Environment
Pratik Gajane

TL;DR
This paper introduces a privacy-preserving algorithm for non-stationary sequential decision making, achieving near-optimal regret bounds while guaranteeing local differential privacy in changing environments.
Contribution
It proposes a novel non-stationary corrupt bandit model, a new algorithm SW-KLUCB-CF with proven near-optimal regret, and a mechanism ensuring local differential privacy with high utility.
Findings
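The regret upper bound for SW-KLUCB-CF is near-optimal in the number of time steps and matches the best known bound for analogous problems in terms of the number of time steps and the number of changes.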
A provably optimal privacy mechanism maintains local differential privacy with high utility.
The approach effectively handles abrupt environmental changes in sequential decision making.
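As a rough illustration of how a bandit index can track abrupt changes, the sketch below combines a sliding window over recent time steps with a Bernoulli KL upper confidence bound. It assumes the "SW" prefix in SW-KLUCB-CF refers to a sliding window, which this summary does not state explicitly. The names `SlidingWindowStats`, `kl_upper_bound`, `window`, and `budget` are hypothetical, and the sketch omits the corruption function that the paper's algorithm accounts for, so it should not be read as the author's method.

```python
import math
from collections import deque


def bernoulli_kl(p, q):
    """KL divergence between Bernoulli(p) and Bernoulli(q), clipped for stability."""
    eps = 1e-12
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))


def kl_upper_bound(mean, count, budget, iters=50):
    """Largest q in [mean, 1] with count * kl(mean, q) <= budget, found by bisection.

    `budget` stands in for the exploration term (for example a logarithm of
    the window length); its exact form here is an assumption.
    """
    lo, hi = mean, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if count * bernoulli_kl(mean, mid) <= budget:
            lo = mid
        else:
            hi = mid
    return lo


class SlidingWindowStats:
    """Keeps only the last `window` time steps, so stale observations are forgotten."""

    def __init__(self, window):
        # Each entry is (arm, feedback); the deque drops the oldest step automatically.
        self.history = deque(maxlen=window)

    def record(self, arm, feedback):
        self.history.append((arm, feedback))

    def count(self, arm):
        return sum(1 for a, _ in self.history if a == arm)

    def mean(self, arm):
        values = [f for a, f in self.history if a == arm]
        return sum(values) / len(values) if values else 0.0
```

Because observations older than `window` steps are discarded, an arm whose mean shifted at a change point is re-estimated from post-change data once the window has moved past the change. The trade-off is fewer samples per arm, which widens the confidence bound.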
Abstract
We study the problem of preserving privacy while still providing high utility in sequential decision-making scenarios in a changing environment. We consider an abruptly changing environment: the environment remains constant over periods of time and changes at unknown time instants. To formulate this problem, we propose a variant of multi-armed bandits called non-stationary stochastic corrupt bandits. We construct an algorithm called SW-KLUCB-CF and prove an upper bound on its utility using the performance measure of regret. The proven regret upper bound for SW-KLUCB-CF is near-optimal in the number of time steps and matches the best known bound for analogous problems in terms of the number of time steps and the number of changes. Moreover, we present a provably optimal mechanism that guarantees the desired level of local differential privacy while providing high utility.
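To make the idea of a locally private feedback mechanism concrete, here is a minimal sketch of standard randomized response for binary rewards. The abstract does not specify the paper's mechanism, so this is an assumed, textbook example rather than the author's construction. The function names and the privacy parameter `epsilon` are illustrative. Each reward is reported truthfully with probability e^ε / (1 + e^ε) and flipped otherwise, which satisfies ε-local differential privacy for a single bit.

```python
import math
import random


def keep_probability(epsilon):
    """Probability of reporting the true bit under randomized response."""
    return math.exp(epsilon) / (1.0 + math.exp(epsilon))


def randomized_response(reward, epsilon, rng=random):
    """Privatize a binary reward (0 or 1) on the user's side.

    The learner only ever sees the returned bit, never `reward` itself.
    """
    p = keep_probability(epsilon)
    return reward if rng.random() < p else 1 - reward


def debias_mean(feedback_mean, epsilon):
    """Recover an estimate of the true mean reward from the mean of noisy bits.

    Uses E[feedback] = (2p - 1) * mu + (1 - p); requires epsilon > 0.
    """
    p = keep_probability(epsilon)
    return (feedback_mean - (1.0 - p)) / (2.0 * p - 1.0)
```

Smaller values of `epsilon` push the keep probability toward 1/2, which strengthens privacy but shrinks the gap between the feedback means of different arms, so the learner needs more samples to tell them apart.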
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Advanced Bandit Algorithms Research · Age of Information Optimization
