Near-Optimal Algorithms for Differentially Private Online Learning in a Stochastic Environment

Bingshan Hu; Zhiming Huang; Nishant A. Mehta; Nidhi Hegde

arXiv:2102.07929 · cs.LG · May 31, 2024 · 1 citation

TL;DR

This paper develops near-optimal differentially private algorithms for online learning in stochastic environments, achieving tight regret bounds for both bandit and full information feedback settings.

Contribution

The paper introduces differentially private algorithms for online learning in stochastic environments, with optimal or near-optimal regret bounds under both bandit and full information feedback.

Findings

01

Achieves optimal instance-dependent regret bounds for private bandit algorithms.

02

Establishes instance-dependent and minimax regret lower bounds for differentially private full information learning.

03

Provides algorithms matching lower bounds up to logarithmic factors.

Abstract

In this paper, we study differentially private online learning problems in a stochastic environment under both bandit and full information feedback. For differentially private stochastic bandits, we propose both UCB and Thompson Sampling-based algorithms that are anytime and achieve the optimal $O\left(\sum_{j : \Delta_j > 0} \frac{\ln(T)}{\min\{\Delta_j, \epsilon\}}\right)$ instance-dependent regret bound, where $T$ is the finite learning horizon, $\Delta_j$ denotes the suboptimality gap between the optimal arm and a suboptimal arm $j$, and $\epsilon$ is the required privacy parameter. For the differentially private full information setting with stochastic rewards, we show an $\Omega\left(\frac{\ln(K)}{\min\{\Delta_{\min}, \epsilon\}}\right)$ instance-dependent regret lower bound and an $\Omega\left(\sqrt{T \ln(K)} + \frac{\ln(K)}{\epsilon}\right)$ minimax…
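To make the bandit bound concrete, the following minimal Python sketch evaluates the rate expression $\sum_{j : \Delta_j > 0} \ln(T) / \min\{\Delta_j, \epsilon\}$ for a given set of gaps. It is only an illustration of the formula in the abstract, not the paper's algorithm: the function name `private_bandit_regret_rate`, the example gaps, and the choice to drop the constant hidden by the $O(\cdot)$ notation are all assumptions made here.

```python
import math


def private_bandit_regret_rate(gaps, horizon, epsilon):
    """Evaluate sum_{j: gap_j > 0} ln(T) / min(gap_j, epsilon).

    Hypothetical helper: it reproduces only the rate inside the O(.)
    bound from the abstract and ignores all constant factors.
    """
    if horizon < 1:
        raise ValueError("horizon T must be at least 1")
    if epsilon <= 0:
        raise ValueError("privacy parameter epsilon must be positive")
    log_t = math.log(horizon)
    # Arms with zero gap are optimal and contribute no regret term.
    return sum(log_t / min(gap, epsilon) for gap in gaps if gap > 0)


if __name__ == "__main__":
    # Illustrative (made-up) gaps: one optimal arm and two suboptimal arms.
    example_gaps = [0.0, 0.1, 0.5]
    for eps in (0.05, 0.2, 1.0):
        rate = private_bandit_regret_rate(example_gaps, horizon=1000, epsilon=eps)
        print(f"epsilon={eps}: rate={rate:.2f}")
```

In this sketch, an arm's term is governed by $\epsilon$ whenever $\epsilon < \Delta_j$ and by the gap otherwise, so once $\epsilon$ is at least as large as every gap the expression coincides with the familiar non-private $\sum_j \ln(T)/\Delta_j$ rate.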

Taxonomy

Topics: Advanced Bandit Algorithms Research · Privacy-Preserving Technologies in Data · Stochastic Gradient Optimization Techniques