Near Optimal Jointly Private Packing Algorithms via Dual Multiplicative Weight Update

Zhiyi Huang; Xue Zhu

arXiv:1905.00812·cs.DS·May 3, 2019


TL;DR

This paper introduces a near-optimal jointly differentially private packing algorithm that lowers the required supply of each resource, proves a nearly matching lower bound, and adds a linear-time, truthful, online variant.

Contribution

It presents an improved $(\epsilon, \delta)$-jointly differentially private packing algorithm with better resource bounds and a nearly matching hardness result, plus a fast, truthful, online approach.

Findings

01

Improves the supply requirement from a total supply of $\tilde{O}(m^{2} / \alpha\epsilon)$ to a per-resource supply of $\tilde{O}(\sqrt{m} / \alpha\epsilon)$

02

Provides a near-matching lower bound on resource requirements for private algorithms

03

Introduces a linear-time, truthful, online private packing algorithm

Abstract

We present an improved $(\epsilon, \delta)$ -jointly differentially private algorithm for packing problems. Our algorithm gives a feasible output that is approximately optimal up to an $\alpha n$ additive factor as long as the supply of each resource is at least $\tilde{O}(\sqrt{m} / \alpha\epsilon)$ , where $m$ is the number of resources. This improves the previous result by Hsu et al. (SODA '16), which requires the total supply to be at least $\tilde{O}(m^{2} / \alpha\epsilon)$ , and only guarantees approximate feasibility in terms of total violation. Further, we complement our algorithm with an almost matching hardness result, showing that $\Omega(\sqrt{m \ln(1/\delta)} / \alpha\epsilon)$ supply is necessary for any $(\epsilon, \delta)$ -jointly differentially private algorithm to compute an approximately optimal packing solution. Finally, we introduce an alternative approach that runs…


Table 1: A comparison of different approaches

Approach	Min supply	Exact feasibility	$\epsilon$ -JDP	Online	Exact truthfulness	Running time (in $n$ )
Dual GD [12]	$\tilde{O}(\frac{m}{\alpha\epsilon})$ ²	No	No	No	No	$\tilde{O}(n^{3})$
Dual MWU (§3)	$\tilde{O}(\frac{\sqrt{m}}{\alpha\epsilon})$	Yes	No	No	No	$\tilde{O}(n^{3})$
Dual online MWU (§5)	$\tilde{O}(\frac{\sqrt{mn}}{\alpha\epsilon})$	Yes	Yes ³	Yes	Yes	$\tilde{O}(n)$

² Hsu et al. [12] require the total supply of all resources to be at least $\tilde{O}(\frac{m^{2}}{\alpha\epsilon})$ . We divide it by $m$ and interpret the result as the (average) supply per resource for a direct comparison with the supply requirements in this paper.
³ To get $\epsilon$ -JDP, we need a larger supply of at least $\tilde{O}(\frac{m\sqrt{n}}{\alpha\epsilon})$ of each resource.

Equations

\Omega\Big(\frac{\sqrt{m\ln(1/\delta)}}{\alpha\epsilon}\Big).

\sum_{i\in[n]}\sum_{k\in[\ell_{i}]}\pi_{ik}x_{ik}

\sum_{i\in[n]}\sum_{k\in[\ell_{i}]}a_{ijk}x_{ik}\leq b_{j}

\sum_{k\in[\ell_{i}]}x_{ik}\leq 1

x_{ik}\geq 0

\max_{x\in X}\min_{p\in[0,\infty)^{m}}\Big(\sum_{i\in[n]}\sum_{k\in[\ell_{i}]}\pi_{ik}x_{ik}-\sum_{j\in[m]}p_{j}\big(\sum_{i\in[n]}\sum_{k\in[\ell_{i}]}a_{ijk}x_{ik}-b_{j}\big)\Big)

L(x,p)=\sum_{j\in[m]}b_{j}p_{j}+\sum_{i\in[n]}\sum_{k\in[\ell_{i}]}x_{ik}\big(\pi_{ik}-\sum_{j\in[m]}a_{ijk}p_{j}\big)

x^{*}_{ik}(p)=\begin{cases}1,&\text{if }\pi_{ik}-\sum_{j=1}^{m}a_{ijk}p_{j}\geq 0\text{ and }k=\operatorname*{arg\,max}_{k^{\prime}\in[\ell_{i}]}\pi_{ik^{\prime}}-\sum_{j=1}^{m}a_{ijk^{\prime}}p_{j};\\ 0,&\text{otherwise.}\end{cases}

\Pr\big[\mathcal{M}(D)\in S\big]\leq\exp(\epsilon)\cdot\Pr\big[\mathcal{M}(D^{\prime})\in S\big]+\delta.

\epsilon^{\prime}=\epsilon\sqrt{2T\ln(1/\delta^{\prime})}+T\epsilon(e^{\epsilon}-1).

\Pr\big[\mathcal{M}_{-i}(D)\in S_{-i}\big]\leq\exp(\epsilon)\cdot\Pr\big[\mathcal{M}_{-i}(D^{\prime})\in S_{-i}\big]+\delta.

x^{(t)}_{ik}=\begin{cases}1,&\text{if }\pi^{(t)}_{ik}-\sum_{j=1}^{m}a_{ijk}p^{(t)}_{j}\geq 0\text{ and }k=\operatorname*{arg\,max}_{k^{\prime}\in[\ell_{i}]}\pi^{(t)}_{ik^{\prime}}-\sum_{j=1}^{m}a_{ijk^{\prime}}p^{(t)}_{j};\\ 0,&\text{otherwise.}\end{cases}

\nabla_{j}\bar{D}(p^{(t)})=\begin{cases}\nabla_{j}\hat{D}(p^{(t)})&\text{if }-\nabla_{\max}\leq\nabla_{j}\hat{D}(p^{(t)})\leq\nabla_{\max};\\ -\nabla_{\max}&\text{if }\nabla_{j}\hat{D}(p^{(t)})<-\nabla_{\max};\\ \nabla_{\max}&\text{if }\nabla_{j}\hat{D}(p^{(t)})>\nabla_{\max}.\end{cases}

L(x,p)=\sum_{i\in[n]}\sum_{k\in[\ell_{i}]}x_{ik}\big(\pi_{ik}-\sum_{j\in[m]}a_{ijk}p_{j}\big)+\sum_{j\in[m]}b_{j}p_{j}

L(x^{(t)},p^{(t)})-L(x^{(t)},p)=\langle p^{(t)}-p,\nabla D(p^{(t)})\rangle

L(x^{(t)},p^{(t)})-L(x^{(t)},p)=\langle p^{(t)}-p,\nabla_{p}L(x^{(t)},p^{(t)})\rangle=\langle p^{(t)}-p,\nabla D(p^{(t)})\rangle,

\eta\nabla_{\max}=\frac{\ln(m+1)}{\alpha bT}\Big(n+\frac{\ln(T)\,8Tm\ln(2/\beta)}{\epsilon}\Big)=\frac{\ln(m+1)\,8m\ln(T)\ln(2/\delta)}{\alpha\epsilon^{2}nb}\leq\frac{\ln(m+1)\,8m\ln(T)\ln(2/\delta)}{\alpha\epsilon^{2}b^{2}}.

D_{KL}(p\|p^{(t+1)})-D_{KL}(p\|p^{(t)})\leq-\eta\big\langle p^{(t)}-p,\nabla\bar{D}(p^{(t)})\big\rangle+\eta^{2}p_{\max}\nabla_{\max}^{2}.

\Pr\Big[\sum_{t=1}^{T}\langle q^{(t)},\nu^{(t)}\rangle\geq\frac{p_{\max}\sqrt{8T\ln(6/\beta)}}{\epsilon^{\prime}}\Big]\leq\tfrac{\beta}{6},

\Pr\Big[\sum_{t=1}^{T}\langle q^{(t)},\nu^{(t)}\rangle\leq-\frac{p_{\max}\sqrt{8T\ln(6/\beta)}}{\epsilon^{\prime}}\Big]\leq\tfrac{\beta}{6}.

\sum_{t=1}^{T}\sum_{j=1}^{m}q^{(t)}_{j}\cdot\max\big\{0,\nu^{(t)}_{j}-\tfrac{\ln(T)}{\epsilon^{\prime}}\big\}\geq\frac{2p_{\max}\sqrt{8T\ln(6/\beta)}}{\epsilon^{\prime}}

\sum_{t=1}^{T}\sum_{j=1}^{m}q^{(t)}_{j}\cdot\max\big\{0,-\nu^{(t)}_{j}-\tfrac{\ln(T)}{\epsilon^{\prime}}\big\}\geq\frac{2p_{\max}\sqrt{8T\ln(6/\beta)}}{\epsilon^{\prime}}

\sum_{t=1}^{T}\big(L(x^{(t)},p^{(t)})-L(x^{(t)},p)\big)\leq\frac{1}{\eta}D_{KL}(p\|p^{(1)})+\eta\cdot Tp_{\max}\nabla_{\max}^{2}+\frac{20p_{\max}Tm\ln(6/\beta)\ln(2/\delta)}{\epsilon}

L(x^{(t)},p^{(t)})=\sum_{i\in[n]}\sum_{k\in[\ell_{i}]}x_{ik}^{(t)}\big(\pi_{ik}-\sum_{j\in[m]}a_{ijk}p_{j}^{(t)}\big)+\sum_{j\in[m]}p_{j}^{(t)}b_{j}.

L(x^{(t)},p^{(t)})\geq\sum_{i\in[n]}\sum_{k\in[\ell_{i}]}x_{ik}^{*}\big(\pi_{ik}-\sum_{j\in[m]}a_{ijk}p_{j}^{(t)}\big)+\sum_{j\in[m]}p_{j}^{(t)}b_{j}.

\sum_{j\in[m]}p_{j}^{(t)}\big(b_{j}-\sum_{i\in[n]}\sum_{k\in[\ell_{i}]}a_{ijk}x_{ik}^{*}\big)+\sum_{i\in[n]}\sum_{k\in[\ell_{i}]}\pi_{ik}x_{ik}^{*}.

\sum_{t=1}^{T}\big(L(x^{(t)},p^{(t)})-L(x^{(t)},p)\big)

=\frac{1}{\eta}p_{\max}\ln(m+1)+\eta\cdot Tp_{\max}\Big(n+\frac{\ln(T)}{\epsilon^{\prime}}\Big)^{2}+\frac{20p_{\max}Tm\ln(6/\beta)\ln(2/\delta)}{\epsilon}

=4\alpha Tn+\frac{T\ln(T)^{2}\ln(2/\delta)\ln(m+1)mn}{\alpha\epsilon^{2}b^{2}}+\frac{20nTm\ln(6/\beta)\ln(2/\delta)}{b\epsilon}.

\frac{T\ln(T)^{2}\ln(2/\delta)\ln(m+1)mn}{\alpha\epsilon^{2}b^{2}}\leq\alpha Tn,\text{ and }\frac{20nTm\ln(6/\beta)\ln(2/\delta)}{b\epsilon}\leq\alpha Tn.

\sum_{t=1}^{T}\big(L(x^{(t)},p^{(t)})-L(x^{(t)},p)\big)\leq 6\alpha Tn.

T(OPT-ALG)\leq\sum_{t=1}^{T}\big(L(x^{(t)},p^{(t)})-L(x^{(t)},p)\big)\leq 6T\alpha n,


Taxonomy

Topics: Complexity and Algorithms in Graphs · Cryptography and Data Security · Privacy-Preserving Technologies in Data

Full text

Near Optimal Jointly Private Packing Algorithms via Dual Multiplicative Weight Update

Zhiyi Huang, Xue Zhu

Department of Computer Science, the University of Hong Kong. This work is supported in part by an RGC grant HKU27200214E. {zhiyi,xzhu2}@cs.hku.hk

Abstract

We present an improved $(\epsilon,\delta)$ -jointly differentially private algorithm for packing problems. Our algorithm gives a feasible output that is approximately optimal up to an $\alpha n$ additive factor as long as the supply of each resource is at least $\tilde{O}(\sqrt{m}/\alpha\epsilon)$ , where $m$ is the number of resources. This improves the previous result by Hsu et al. (SODA ’16), which requires the total supply to be at least $\tilde{O}(m^{2}/\alpha\epsilon)$ , and only guarantees approximate feasibility in terms of total violation. Further, we complement our algorithm with an almost matching hardness result, showing that $\Omega(\sqrt{m\ln(1/\delta)}/\alpha\epsilon)$ supply is necessary for any $(\epsilon,\delta)$ -jointly differentially private algorithm to compute an approximately optimal packing solution. Finally, we introduce an alternative approach that runs in linear time, is exactly truthful, can be implemented online, and can be $\epsilon$ -jointly differentially private, but requires a larger supply of each resource.

1 Introduction

Handling user data has become the focal point of modern computational problems, bringing up many new challenges including user privacy. The consensus notion of privacy in theoretical computer science is differential privacy introduced by Dwork et al. [7]. Informally, a mechanism is differentially private if the output distribution is insensitive to the change of an individual’s data. Since its introduction, the community has introduced differentially private algorithms for a variety of problems, including computation of numerical statistics (e.g., [9, 4]), machine learning (e.g., [6, 15]), game theory (e.g., [17, 14]), etc.

However, many fundamental combinatorial problems, such as the bipartite matching problem, provably do not admit any differentially private algorithm with non-trivial approximation guarantee [11]. To resolve the situation, Hsu et al. [11] adopt a relaxed notion called joint differential privacy by Kearns et al. [16] and introduce jointly differentially private algorithms for the bipartite matching problem and more generally the welfare maximization problem w.r.t. gross substitutes valuations. Hsu et al. [12] further develop the techniques in [11], and propose a general framework that solves a large family of convex programs in a jointly differentially private manner via a noisy version of the dual gradient descent method.

While our techniques can be applied to the generic convex programs studied in [12], we focus on the packing problem in this conference version due to space constraints. Consider a packing problem with $n$ agents and $m$ resources where each agent has different values for different bundles of resources. The value and resource demands of an agent are private. The goal is to allocate bundles to agents so as to maximize the sum of values of all agents subject to the supply constraints of resources. The algorithm by Hsu et al. [12] incurs an $\tilde{O}(m^{2}/\epsilon)$ additive loss in terms of the objective and up to $\tilde{O}(m^{2}/\epsilon)$ additive total violation of the supply constraints. In other words, their algorithm guarantees up to $\alpha n$ additive loss in the objective if $n\geq\tilde{O}(m^{2}/\alpha\epsilon)$ , and the total violation is at most an $\alpha$ fraction of the total supply if the supply per constraint is at least $\tilde{O}(m/\alpha\epsilon)$ . No non-trivial guarantee is given in terms of per-constraint violation.

Our contributions. Our first result is an $(\epsilon,\delta)$ -jointly differentially private algorithm (Sec. 3) that improves the results by Hsu et al. [12] in two ways: (1) it reduces the supply requirement in terms of the dependence on the number of resources $m$ ; and (2) it provides exact feasibility with high probability.¹ Concretely, we show the following:

¹ We note that Hsu et al. [12] can also obtain exact feasibility by shifting the error to the objective, under the extra assumption that each unit of resource provides at most value $1$ in the objective. Our result does not need such an assumption.

Theorem 1.1

For any packing problem with $m$ constraints, there is an $(\epsilon,\delta)$ -jointly differentially private algorithm whose output is feasible and approximately optimal up to an $\alpha n$ additive factor as long as the supply of each resource is at least $\tilde{O}(\sqrt{m}/\alpha\epsilon)$ .

Our algorithm is a noisy version of the multiplicative weight update algorithm running in the dual space. To compute a primal packing solution, we maintain a set of dual prices that coordinates the primal decisions: an item is selected in the packing if and only if its value is greater than the total price of its resource demands. The dual prices are then updated using the multiplicative weight update method (e.g., [1]) with Laplacian noise added to each iteration to ensure that the dual prices are differentially private. Finally, the billboard lemma introduced by Hsu et al. [11] indicates that the primal packing solution satisfies joint differential privacy if the coordinators, i.e., the dual prices, satisfy differential privacy.

The main technical challenges then arise from bounding the error introduced by the Laplacian noise. First, the standard analysis of the multiplicative weight update framework requires the update weights to be bounded, while our noisy update weights are unbounded. To resolve this problem, we truncate noisy update weights that are either too large or too small, and bound the error due to this truncation using the small-tail properties of Laplacian distributions (Lemma 3.6). Second, to obtain with-high-probability guarantees, one resorts to concentration bounds, which generally require the random variables to be bounded. To this end, we introduce a concentration lemma for Laplacian distributed variables (Lemma 3.5).

Then, we complement our algorithm with an almost matching lower bound on the minimum supply required for any $(\epsilon,\delta)$ -jointly differentially private algorithm to compute an approximately optimal solution (Sec. 4).

Theorem 1.2

(Hardness for $(\epsilon,\delta)$ -private algorithms) If there is an $(\epsilon,\delta)$ -jointly differentially private algorithm that with high probability outputs a feasible solution for the packing problem that is approximately optimal up to an additive $O(\alpha n)$ , then the supply per constraint must be at least

$$\Omega\Big(\frac{\sqrt{m\ln(1/\delta)}}{\alpha\epsilon}\Big).$$

This lower bound significantly improves the best previous bound of $\Omega\big{(}{1}/{\sqrt{\alpha}}\big{)}$ by Hsu et al. [11]. It matches the supply requirement in Theorem 1.1 up to log factors, certifying that our algorithm has achieved near optimal trade-offs between privacy and accuracy. The proof of this hardness result draws a novel connection between jointly differentially private packing algorithms and differentially private query release algorithms. We show that one can take an arbitrary jointly differentially private packing algorithm as a blackbox and use it to construct a differentially private query release algorithm for answering arbitrary counting queries. Then, the hardness result follows from existing hardness for query release by Steinke and Ullman [23].

Having achieved the optimal tradeoffs between privacy and accuracy, we turn to other aspects of the algorithm such as running time. A common drawback of the algorithms in the previous dual gradient descent approach by Hsu et al. [12] and that in Theorem 1.1 is the cubic dependence in $n$ in the running time. Is there a privacy-preserving algorithm that goes over each agent’s data only once and still finds an approximately optimal packing solution?

To this end, we introduce an alternative approach that can be interpreted as a privacy-preserving version of the online packing algorithm in the random-arrival model by Agrawal and Devanur [1]. In each round of dual update, instead of computing the best responses of all agents to calculate an accurate dual subgradient, the new approach picks one agent and uses his best response to calculate a proxy subgradient and updates the dual prices accordingly; the agent gets his best response bundle. The algorithm updates the dual prices for exactly $n$ rounds, using each agent to compute the proxy subgradient exactly once. The order in which the algorithm picks the agents is chosen uniformly at random at the beginning.

The new approach runs in linear time, significantly faster than the other approaches whose running time has cubic dependence in $n$ . However, it needs a larger supply of each resource, i.e., at least $\tilde{O}\big{(}\frac{\sqrt{nm}}{\alpha\epsilon}\big{)}$ , in order to compute an approximately optimal solution. Whether there exists a jointly differentially private algorithm that both runs in linear time and achieves the optimal tradeoffs between privacy and accuracy characterized in Theorem 1.1 and Theorem 1.2 is an interesting open problem.

Further, the new approach can be implemented in the online random-arrival setting, where the agents show up one by one in a random order and the algorithm must decide the allocation to each agent at his arrival. It also achieves exact truthfulness if we charge each agent the dual prices in his round because every agent gets his best response bundle. The previous approaches by Hsu et al. [13, 12] achieve only approximate truthfulness.

Last but not least, the approach in this section can be implemented in an $\epsilon$ -jointly differentially private manner, provided that the supply of each resource is at least $\tilde{O}\big{(}\frac{m\sqrt{n}}{\alpha\epsilon}\big{)}$ . Neither the dual multiplicative weight update approach in Theorem 1.1 nor the approach in previous work [12] can achieve $\epsilon$ -joint differential privacy, unless the supply $b\geq n$ in which case the problem is trivial.

Theorem 1.3

For any packing problem with $m$ constraints, there is a linear time, truthful, and $(\epsilon,\delta)$ -jointly differentially private algorithm whose output is feasible and approximately optimal up to an $\alpha n$ additive factor as long as the supply of each resource is at least $\tilde{O}(\sqrt{mn}/\alpha\epsilon)$ . Further, the algorithm can work in the online random-arrival model, and can get $\epsilon$ -joint differential privacy if the supply of each resource is at least $\tilde{O}(m\sqrt{n}/\alpha\epsilon)$ .

We present in Table 1 a brief comparison of the dual gradient descent approach by Hsu et al. [12], the dual multiplicative weight update approach in Theorem 1.1, and the dual online multiplicative weight update approach in Theorem 1.3.

Other related work. McSherry and Talwar [17] propose the exponential mechanism as a generic method for designing differentially private algorithms for optimization problems for which the set of feasible outcomes does not depend on user data. As one may expect, such a generic method is not computationally efficient in general. Bassily et al. [3] introduce an efficient implementation of the exponential mechanism when the set of feasible outcomes further forms a compact subset in the Euclidean space and the objective is a convex function. However, these techniques are not directly applicable to our problem as the set of feasible outcomes of packing problems crucially depends on user data. Hsu et al. [13] systematically study what linear programs can be solved in a differentially private manner and what cannot. However, the packing linear program provably cannot be solved differentially privately [11].

Since the introduction of joint differential privacy by Kearns et al. [16], it has found a wide range of applications, including equilibrium selection [20], max flow [21], mechanism design [16], privacy-preserving surveys [10], etc. A common technical ingredient of these works is the billboard lemma introduced by Hsu et al. [11], which also serves as an important building block of the analysis in this paper.

The packing problem has been extensively studied in both the offline (e.g., [19, 22]) and online settings (e.g., [5]). We note that many of these works use the primal-dual technique. Our work can be viewed as an adoption of these techniques in the privacy-preserving context.

2 Preliminaries

2.1 Packing problem and the (partial) dual

Consider a packing problem with $n$ agents and $m$ resources. Let $[\ell]$ denote the set $\{1,2,\dots,\ell\}$ for any positive integer $\ell$ . Each agent $i\in[n]$ demands one of $\ell_{i}$ bundles of resources. If we allocate a bundle $k\in[\ell_{i}]$ to agent $i$ , his valuation will be $\pi_{ik}\in[0,1]$ and an $a_{ijk}\in[0,1]$ amount of resource $j$ will be consumed for every $j\in[m]$ ; if we do not allocate any bundle to agent $i$ , his valuation will be $0$ and no resource will be consumed. The parameters associated with an agent $i$ are the private data of the agent. Let $U$ denote the data universe and let $D\in U^{n}$ denote a dataset of $n$ agents. Each resource $j\in[m]$ has supply $b_{j}$ . The goal is then to choose a subset of the items that maximizes the total valuation subject to the supply constraints. This can be formulated as the following packing linear program:

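$$\begin{array}{lll}\max & \sum_{i\in[n]}\sum_{k\in[\ell_{i}]}\pi_{ik}x_{ik} & \\ \text{s.t.} & \sum_{i\in[n]}\sum_{k\in[\ell_{i}]}a_{ijk}x_{ik}\leq b_{j} & \forall j\in[m] \\ & \sum_{k\in[\ell_{i}]}x_{ik}\leq 1 & \forall i\in[n] \\ & x_{ik}\geq 0 & \forall i\in[n],\ k\in[\ell_{i}]\end{array}$$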

Let $X_{i}=\{x_{i}\in[0,1]^{\ell_{i}}:\sum_{k\in[\ell_{i}]}x_{ik}\leq 1\}$ denote the set of feasible decisions associated with agent $i$ and $X=X_{1}\times\dots\times X_{n}$ denote the feasible decisions of all agents if we ignore the supply constraints. The partial Lagrangian of the above program is:

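$$\max_{x\in X}\min_{p\in[0,\infty)^{m}}\Big(\sum_{i\in[n]}\sum_{k\in[\ell_{i}]}\pi_{ik}x_{ik}-\sum_{j\in[m]}p_{j}\big(\sum_{i\in[n]}\sum_{k\in[\ell_{i}]}a_{ijk}x_{ik}-b_{j}\big)\Big)$$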

Let $L(x,p)$ denote the above partial Lagrangian objective, that is,

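$$L(x,p)=\sum_{j\in[m]}b_{j}p_{j}+\sum_{i\in[n]}\sum_{k\in[\ell_{i}]}x_{ik}\big(\pi_{ik}-\sum_{j\in[m]}a_{ijk}p_{j}\big)$$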

Let $D(p)=\max_{x\in X}L(x,p)$ denote the dual objective. We shall interpret $p_{j}$ as the unit price of resource $j$ for any $j\in[m]$ . Given a set of prices $p$ , an optimal solution of the optimization problem $\max_{x\in X}L(x,p)$ is:

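$$x^{*}_{ik}(p)=\begin{cases}1,&\text{if }\pi_{ik}-\sum_{j=1}^{m}a_{ijk}p_{j}\geq 0\text{ and }k=\operatorname*{arg\,max}_{k^{\prime}\in[\ell_{i}]}\pi_{ik^{\prime}}-\sum_{j=1}^{m}a_{ijk^{\prime}}p_{j};\\ 0,&\text{otherwise.}\end{cases}$$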

Here, we break ties in lexicographical order in the maximization problem of the first case.
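The best response of a single agent to posted prices can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `best_response` and the list-based encoding of $\pi_{i\cdot}$ and $a_{i\cdot\cdot}$ are hypothetical, and ties are broken toward the smallest bundle index as one reading of the lexicographical rule above.

```python
from typing import List, Optional


def best_response(values: List[float],
                  demands: List[List[float]],
                  prices: List[float]) -> Optional[int]:
    """Return the index k of the bundle chosen by one agent, or None.

    values[k]     -- pi_{ik}, the agent's value for bundle k (hypothetical encoding)
    demands[k][j] -- a_{ijk}, the amount of resource j used by bundle k
    prices[j]     -- p_j, the current unit price of resource j
    """
    best_k: Optional[int] = None
    best_utility = 0.0
    for k, (value, demand) in enumerate(zip(values, demands)):
        # Utility of bundle k: value minus the priced resource demand.
        utility = value - sum(a * p for a, p in zip(demand, prices))
        # A bundle is eligible only if its utility is non-negative; the strict
        # comparison keeps the smallest index among equally good bundles.
        if utility >= 0 and (best_k is None or utility > best_utility):
            best_k, best_utility = k, utility
    return best_k
```

Returning `None` corresponds to $x^{*}_{ik}(p)=0$ for every $k$, i.e., the agent receives no bundle because every bundle costs more than it is worth at the current prices.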

The following lemma follows by the Envelope theorem (e.g., [18]).

Lemma 2.1

Given any set of prices $p$ , $\nabla_{p}L\big{(}x^{*}(p),p\big{)}$ is a sub-gradient of $D(p)$ .

2.2 Joint differential privacy

Next, we present necessary preliminaries for differential privacy and joint differential privacy. Two datasets $D,D^{\prime}\in U^{n}$ are $i$ -neighbors if they differ only in their $i$ -th entry, that is, $D_{j}=D^{\prime}_{j}$ for all $j\neq i$ . The notion of differential privacy by Dwork et al. [7] requires the output distributions to be similar for any neighboring datasets.

Definition 2.1 (Differential privacy [7])

A mechanism $\mathcal{M}:{U}^{n}\to X$ is $(\epsilon,\delta)$ -differentially private if for any $i\in[n]$ , any $i$ -neighbors $D,D^{\prime}\in{U}^{n}$ , and any subset $S\subseteq X$ , we have:

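$$\Pr\big[\mathcal{M}(D)\in S\big]\leq\exp(\epsilon)\cdot\Pr\big[\mathcal{M}(D^{\prime})\in S\big]+\delta.$$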

The Laplace mechanism by Dwork et al. [7] computes numerical statistics of a dataset differentially privately by adding Laplacian distributed noise to the output. We shall use the Laplace mechanism to maintain a sequence of differentially private dual prices.

Definition 2.2 (Laplace mechanism [7])

Suppose $f:{U}^{n}\rightarrow\mathbb{R}$ has sensitivity $\sigma$ , i.e., $|f(D)-f(D^{\prime})|\leq\sigma$ for any neighboring $D$ and $D^{\prime}$ . Given a database $D\in{U}^{n}$ , the Laplace mechanism outputs $f(D)+Z$ , where $Z\sim Lap(\sigma/\epsilon)$ .

Lemma 2.2 ([7])

The Laplace mechanism is $(\epsilon,0)$ -differentially private.
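As a minimal sketch of Definition 2.2, the following Python function adds Laplace noise of scale $\sigma/\epsilon$ to a statistic. The function name and the optional `rng` parameter are assumptions made for illustration; the noise is sampled as the difference of two independent exponential variables, which has a Laplace distribution with the same scale.

```python
import random
from typing import Optional


def laplace_mechanism(true_value: float,
                      sensitivity: float,
                      epsilon: float,
                      rng: Optional[random.Random] = None) -> float:
    """Return true_value + Z with Z ~ Lap(sensitivity / epsilon)."""
    if sensitivity <= 0 or epsilon <= 0:
        raise ValueError("sensitivity and epsilon must be positive")
    rng = rng or random.Random()
    scale = sensitivity / epsilon
    # The difference of two i.i.d. exponentials with mean `scale`
    # is Laplace-distributed with location 0 and scale `scale`.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_value + noise
```

A smaller `epsilon` gives a larger noise scale, which is the privacy and accuracy trade-off that the supply requirements in this paper quantify.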

One can combine differentially private subroutines to obtain algorithms for more complicated tasks; the privacy parameter will scale gracefully. This is formalized as the composition theorem:

Lemma 2.3 (Composition Theorem [8])

Suppose $\mathcal{A}$ is a $T$ -fold adaptive composition of $(\epsilon,\delta)$ -differentially private mechanisms. Then, $\mathcal{A}$ satisfies $(\epsilon^{\prime},T\delta+\delta^{\prime})$ -differential privacy for

$$\epsilon^{\prime}=\epsilon\sqrt{2T\ln(1/\delta^{\prime})}+T\epsilon(e^{\epsilon}-1).$$
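To get a feel for how the privacy parameter scales, the composition bound $\epsilon^{\prime}=\epsilon\sqrt{2T\ln(1/\delta^{\prime})}+T\epsilon(e^{\epsilon}-1)$ can be evaluated numerically. The helper below is a hypothetical convenience function, not part of the paper; it only evaluates the formula of Lemma 2.3.

```python
import math


def composed_epsilon(epsilon: float, num_rounds: int, delta_prime: float) -> float:
    """Evaluate eps' = eps*sqrt(2T ln(1/delta')) + T*eps*(e^eps - 1)."""
    if not 0.0 < delta_prime < 1.0:
        raise ValueError("delta_prime must lie in (0, 1)")
    if epsilon < 0 or num_rounds < 1:
        raise ValueError("epsilon must be >= 0 and num_rounds >= 1")
    square_root_term = epsilon * math.sqrt(2.0 * num_rounds * math.log(1.0 / delta_prime))
    linear_term = num_rounds * epsilon * (math.exp(epsilon) - 1.0)
    return square_root_term + linear_term
```

For small per-round $\epsilon$ the first term dominates, so the total privacy loss grows roughly like $\sqrt{T}$ rather than $T$.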

As mentioned in the introduction, for many optimization problems including the packing problem considered in this paper, we need to consider a relaxed notion called joint differential privacy proposed by Kearns et al. [16]. Informally, joint differential privacy is defined w.r.t. problems whose outputs are comprised of $n$ components, one for each of the $n$ agents. It relaxes the requirement of differential privacy so that for any $i$ -neighboring datasets, only the output components of agents other than $i$ need to be similarly distributed.

Definition 2.3 (Joint differential privacy [16])

A mechanism $\mathcal{M}:{U}^{n}\to X=X_{1}\times\dots\times X_{n}$ is $(\epsilon,\delta)$ -jointly differentially private if for any $i\in[n]$ , any $i$ -neighbors $D,D^{\prime}\in{U}^{n}$ , and any subset $S_{-i}\subseteq X_{-i}$ , we have:

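$$\Pr\big[\mathcal{M}_{-i}(D)\in S_{-i}\big]\leq\exp(\epsilon)\cdot\Pr\big[\mathcal{M}_{-i}(D^{\prime})\in S_{-i}\big]+\delta.$$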

The most important connection between joint differential privacy and differential privacy is the following billboard lemma established by Hsu et al. [11]. This lemma has been the cornerstone of many recent works on joint differential privacy and plays a crucial role in this paper as well.

Lemma 2.4 (Billboard Lemma [11])

Suppose $\mathcal{M}:{U}^{n}\rightarrow{Y}^{T}$ is $(\epsilon,\delta)$ -differentially private. Consider any set of functions $f_{i}:{U}\times{Y}^{T}\rightarrow{X}^{\prime}$ . Then the mechanism $\mathcal{M}^{\prime}$ that outputs $f_{i}(D_{i},\mathcal{M}(D))$ to each agent $i$ is $(\epsilon,\delta)$ -jointly differentially private.

3 Private dual multiplicative weight algorithm

Our algorithm is a noisy version of the multiplicative weight update algorithm running in the dual space. First, recall the partial Lagrangian objective of the packing problem:

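$$L(x,p)=\sum_{i\in[n]}\sum_{k\in[\ell_{i}]}x_{ik}\big(\pi_{ik}-\sum_{j\in[m]}a_{ijk}p_{j}\big)+\sum_{j\in[m]}b_{j}p_{j}$$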

For simplicity, we assume $b_{j}=b$ for all $j\in[m]$ . It is straightforward to extend our results to general values of the $b_{j}$ ’s. Then, it is without loss of generality to assume that $n\geq b$ , as otherwise the optimal solution is trivial with $x_{ik^{*}}=1$ where $k^{*}=\operatorname*{arg\,max}_{k\in[\ell_{i}]}\pi_{ik}$ , for all $i$ .

For convenience of discussion, we add a dummy constraint $\langle 0,x\rangle\leq 0$ as the $(m+1)$ -th constraint. As a result, there is a new dual variable $p_{m+1}$ corresponding to the new constraint and, thus, $p$ becomes an $(m+1)$ -dimensional vector. We will restrict $p$ such that its $\ell_{1}$ norm equals some appropriately chosen $p_{\max}$ .

The multiplicative weight update algorithm finds a set of dual prices $p$ that approximately minimizes the dual objective $D(p)=\max_{x\in X}L(x,p)$ . In the process, it also finds an approximately optimal primal solution. Concretely, it starts with an initial $p^{(1)}=\tfrac{p_{\max}}{m+1}\cdot\mathbf{1}$ , where $\mathbf{1}$ is the all- $1$ vector. In each round $t$ , it first computes a sub-gradient of the dual objective $\nabla D(p^{(t)})$ using the envelope theorem, which boils down to computing the best response of the agents to the current dual prices. Then, it computes $p^{(t+1)}$ by multiplying each entry of $p^{(t)}$ by $1-\eta\nabla_{j}D(p^{(t)})$ for some appropriately chosen step size $\eta$ , and normalizing it to have $\ell_{1}$ norm $p_{\max}$ . (This is equivalent to a projection back to the simplex $\|p\|_{1}=p_{\max}$ with respect to the Kullback-Leibler divergence.) In order to get joint differential privacy, we use a noisy version of the sub-gradient in our algorithm and show that the error introduced by the Laplacian noise can be bounded. The algorithm is presented as Algorithm 1.
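One round of the noisy update described above might look as follows in Python. This is a sketch under assumptions, not a transcription of Algorithm 1: the function name `noisy_mwu_step` and its parameters are hypothetical, the noise scale $1/\epsilon^{\prime}$ and the truncation to $[-\nabla_{\max},\nabla_{\max}]$ follow the description in the text, and every coordinate passed in (including a dummy one, if the caller supplies it) is treated the same way.

```python
import random
from typing import List


def noisy_mwu_step(prices: List[float],
                   subgradient: List[float],
                   eta: float,
                   p_max: float,
                   grad_max: float,
                   eps_prime: float,
                   rng: random.Random) -> List[float]:
    """One noisy multiplicative weight update of the dual prices.

    Assumes eta * grad_max < 1 so that every multiplier stays positive.
    """
    if eta * grad_max >= 1.0:
        raise ValueError("need eta * grad_max < 1")
    updated = []
    for price, grad in zip(prices, subgradient):
        # Laplace noise of scale 1/eps_prime, sampled as a difference of
        # two exponentials with rate eps_prime.
        noise = rng.expovariate(eps_prime) - rng.expovariate(eps_prime)
        # Truncate the noisy sub-gradient to [-grad_max, grad_max].
        clipped = max(-grad_max, min(grad_max, grad + noise))
        updated.append(price * (1.0 - eta * clipped))
    # Renormalize so that the l1 norm equals p_max again.
    total = sum(updated)
    return [p_max * weight / total for weight in updated]
```

A positive sub-gradient coordinate (unused supply) lowers the corresponding price, while a negative one (over-demand) raises it, which is how the prices steer the agents' best responses toward feasibility.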

3.1 Proof of Theorem 1.1 (privacy)

The privacy part follows from the next Lemma 3.1 and the Billboard Lemma (Lemma 2.4).

Lemma 3.1

The sequence of duals $p^{(1)},\dots,p^{(T)}$ given by Pri-DMW is $(\epsilon,\delta)$ -differentially private.

Proof.

Note that the sequence of dual price vectors is determined by the sequence of noisy sub-gradients $\nabla\hat{D}(p^{(t)})$ . Hence, it suffices to show that the sequence of noisy sub-gradients is $(\epsilon,\delta)$ -differentially private.

Next, observe that the noisy sub-gradient $\nabla_{j}\hat{D}(p^{(t)})$ of each step $t$ is computed by adding Laplace noise of scale $\tfrac{1}{\epsilon^{\prime}}$ to the sub-gradient $\nabla_{j}D(p^{(t)})=b-\sum_{i\in[n]}\sum_{k\in[\ell_{i}]}a_{ijk}x_{ik}^{(t)}$ . Since $a_{ijk}\in[0,1]$ and each agent is allocated at most one bundle, changing the data of one agent changes each coordinate of the sub-gradient by at most $1$ , i.e., the sub-gradient has sensitivity $1$ . Hence, given $p^{(t)}$ , the computation of the noisy sub-gradient $\nabla_{j}\hat{D}(p^{(t)})$ satisfies $(\epsilon^{\prime},0)$ -differential privacy (Lemma 2.2). Further, $p^{(t)}$ is determined by the noisy sub-gradients $\nabla_{j}\hat{D}(p^{(1)}),\dots,\nabla_{j}\hat{D}(p^{(t-1)})$ in previous rounds. Hence, the sequence of noisy sub-gradients is computed via an adaptive composition of $Tm$ Laplace mechanisms, each of which is $(\epsilon^{\prime},0)$ -differentially private. The lemma then follows by the composition theorem (Lemma 2.3).

3.2 Proof of Theorem 1.1 (approximation)

In this subsection, we will show that our algorithm violates each constraint by at most $\alpha b$ and is approximately optimal up to an $\alpha n$ additive factor. Then, to get exact feasibility as in Theorem 1.1, we simply run Pri-DMW with $(1-\alpha)b$ as the supply per constraint, noting that doing so decreases the optimal objective by at most a $(1-\alpha)$ multiplicative factor.

We first introduce some useful facts and technical lemmas.

Lemma 3.2

For any $p$ , we have that:

[TABLE]

Proof.

By the definition of $L(x,p)$ , we have:

[TABLE]

where the second equality is due to the envelope theorem (Lemma 2.1).

Lemma 3.3

If $b\geq\frac{20\ln(T)\sqrt{m\ln(m+1)\ln(6/\beta)\ln(2/\delta)}}{\alpha\epsilon}$ , then we have $\eta\nabla_{\max}<1$ .

Proof.

Plug in the value of $\eta,\nabla_{\max},T$ . We have:

[TABLE]

So if $b\geq\frac{20\ln(T)\sqrt{m\ln(m+1)\ln(6/\beta)\ln(2/\delta)}}{\alpha\epsilon}$ , we have $\eta\nabla_{\max}<1$ .
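For a sense of scale, the supply threshold in Lemma 3.3 can be evaluated directly. In this sketch the number of rounds $T$ is treated as a free input, which is an assumption: in the algorithm $T$ is itself set as a function of the other parameters.

```python
import math


def min_supply(m: int, T: int, alpha: float, eps: float,
               beta: float, delta: float) -> float:
    """Right-hand side of the supply condition in Lemma 3.3."""
    root = math.sqrt(m * math.log(m + 1)
                     * math.log(6.0 / beta) * math.log(2.0 / delta))
    return 20.0 * math.log(T) * root / (alpha * eps)


# Hypothetical parameter values, chosen only to show how the bound scales.
base = min_supply(m=10, T=100, alpha=0.1, eps=1.0, beta=0.1, delta=1e-6)
tighter = min_supply(m=10, T=100, alpha=0.05, eps=1.0, beta=0.1, delta=1e-6)
```

Halving $\alpha$ or $\epsilon$ doubles the required supply, while the dependence on $m$ is only $\sqrt{m\ln(m+1)}$.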

Lemma 3.4

For any $p$ with $\|p\|_{1}=p_{\max}$ , and $\eta\nabla_{\max}<1$ , we have:

[TABLE]

This lemma follows by the standard analysis of multiplicative weight update. We include the proof in Appendix A for the sake of completeness. The proofs of the next three lemmas are also deferred to Appendix A.

Lemma 3.5

For any $q^{(1)},\dots,q^{(T)}$ such that $\|q^{(t)}\|_{1}=p_{\max}$ and that $q^{(t)}$ depends only on $\nu^{(1)},\dots,\nu^{(t-1)}$ , we have that:

[TABLE]

and

[TABLE]

Lemma 3.6

For any $q^{(1)},\dots,q^{(T)}\geq 0$ such that $\|q^{(t)}\|_{1}\leq 2p_{\max}$ and $\|q^{(t)}\|_{\infty}\leq p_{\max}$ , and that $q^{(t)}$ depends only on $\nu^{(1)},\dots,\nu^{(t-1)}$ for $1\leq t\leq T$ , we have that with probability at most $\beta/3$

[TABLE]

and with probability at most $\beta/3$

[TABLE]

Putting the above pieces together, we have the following key lemma that is useful for showing approximate optimality and feasibility.

Lemma 3.7

For any $p$ such that $\|p\|_{1}=p_{\max}$ , with probability at least $1-\beta$ , we have:

[TABLE]

3.2.1 Approximate optimality

Lemma 3.8

For any $t$ , we have $L(x^{(t)},p^{(t)})\geq OPT$ .

Proof.

Let $x^{*}$ be the optimal primal solution. Recall the definition of the Lagrangian objective. We have that:

[TABLE]

Since $x^{(t)}$ maximizes $L(x,p^{(t)})$, we get that:

[TABLE]

Rearranging terms, this is further equal to

[TABLE]

Finally, note that by the definition of $x^{*}$, the first term equals $OPT$ and the second term is greater than or equal to zero. So the lemma follows.
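The chain of inequalities in this proof can be summarized as follows. This is a sketch that assumes the Lagrangian has the form $L(x,p)=\sum_{i}\sum_{k}\pi_{ik}x_{ik}+\sum_{j}p_{j}\big(b-\sum_{i}\sum_{k}a_{ijk}x_{ik}\big)$, which is consistent with the sub-gradient expression used in Section 3.1; the dummy constraint contributes zero.

```latex
\begin{align*}
L(x^{(t)}, p^{(t)})
  &\geq L(x^{*}, p^{(t)}) \\
  &= \sum_{i\in[n]}\sum_{k\in[\ell_i]} \pi_{ik}\, x^{*}_{ik}
     + \sum_{j\in[m]} p^{(t)}_{j}
       \Big(b - \sum_{i\in[n]}\sum_{k\in[\ell_i]} a_{ijk}\, x^{*}_{ik}\Big) \\
  &\geq OPT .
\end{align*}
```

The first step uses that $x^{(t)}$ is a best response to $p^{(t)}$; the last step uses $p^{(t)}\geq 0$ together with the feasibility of $x^{*}$.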

Proof.

(Approximate optimality) Recall that we add a dummy constraint $\langle 0,x\rangle\leq 0$. We let $p_{m+1}=p_{\max}$ and $p_{j}=0$ for all $j\leq m$. By Lemma 3.7, the following holds with probability at least $1-\beta$:

[TABLE]

If $b\geq\frac{20\ln(T)\sqrt{m\ln(m+1)\ln(6/\beta)\ln(2/\delta)}}{\alpha\epsilon}$ , we further bound the 2nd and 3rd terms by

[TABLE]

So we get that

[TABLE]

Note that $\sum_{i\in[n]}\sum_{k\in[\ell_{i}]}\pi_{ik}\bar{x}_{ik}=ALG$ . By our choice of $p$ , we have $\sum_{t=1}^{T}L(x^{(t)},p)=T\cdot ALG$ . By Lemma 3.8, we have $\sum_{t=1}^{T}L(x^{(t)},p^{(t)})\geq T\cdot OPT$ . So we have:

[TABLE]

So we have the desired approximate optimality guarantee.

3.2.2 Approximate feasibility

Proof.

(Approximate feasibility) We choose $p$ to penalize the over-demands and, thus, make $L(x^{(t)},p)$ as small as possible. We let $p_{j^{*}}=p_{\max}$, where $j^{*}$ is the most over-demanded constraint, and let $p_{j}=0$ for any $j\neq j^{*}$. Let $s=\sum_{i\in[n]}\sum_{k\in[\ell_{i}]}a_{ij^{*}k}\bar{x}_{ik}-b$ be the over-demand of $j^{*}$. By Lemma 3.7 and the choice of $p$ and $p^{(1)}$, with probability at least $1-\beta$, we have:

[TABLE]

By the choice of $p$ , we further get that:

[TABLE]

Putting this together with Lemma 3.8, the LHS of the above inequality is at least $T(OPT-ALG+p_{\max}s)$.

Also note that $ALG\leq(1+\frac{s}{b})OPT$ , because increasing the supply per resource from $b$ to $b+s$ increases the optimal packing objective by at most a $\frac{b+s}{b}$ factor. So we have:

[TABLE]

So we have that:

[TABLE]

Plugging in the choice of parameters, we get that

[TABLE]

Therefore, the max violation per constraint is at most $s\leq O(\alpha b)$ as long as the supply per constraint is at least $b\geq\frac{20\ln(T)\sqrt{m\ln(m+1)\ln(6/\beta)\ln(2/\delta)}}{\alpha\epsilon}$ .

4 Hardness (Theorem 1.2)

Theorem 4.1 (Theorem 1.1 of [23])

Suppose there is an $(\epsilon,\delta)$ -differentially private algorithm for answering $m$ arbitrary counting queries on a dataset of size $b$ with average error at most $O(\alpha b)$ . Then, we have

[TABLE]

Lemma 4.1

Suppose for some $b$ and $m$ , there is an $(\epsilon,\delta)$ -jointly differentially private algorithm that with high probability outputs a feasible solution that is optimal up to an $\alpha n$ additive factor for $n=\Theta(b)$ . Then, there is an $(\epsilon,\delta)$ -differentially private algorithm for answering $m$ arbitrary counting queries on any dataset of size $\Theta(b)$ with average error at most $O(\alpha b)$ .

Proof.

Consider an arbitrary dataset of size $n^{\prime}=\frac{b}{2}$ , denoted as $D^{\prime}=\{d_{1},\dots,d_{n^{\prime}}\}$ , and an arbitrary set of $m$ counting queries of sensitivity $1$ , denoted as $\mathcal{Q}=\{q_{1},\dots,q_{m}\}$ . Construct an instance of the packing problem with $n=\Theta(b)$ agents as follows:

Let there be a set of $n^{\prime}$ agents, denoted as $A$, each of whom demands a single bundle, i.e., $|[\ell_{i}]|=1$, and we therefore omit the subscript $k$ in the following. The resource demanded in the bundle is $a_{ij}=q_{j}(d_{i})$ and the value is $\pi_{i}=1$.

Further, let there be $2b$ agents, denoted as $B$, each of whom demands any subset of size $\frac{m}{2}$ and has value $\frac{1}{4}$. That is, $\ell_{i}={m\choose m/2}$; for any subset $S\subseteq[m]$ of size $\frac{m}{2}$, let there be a $k$ such that $a_{ijk}=1$ if $j\in S$ and $0$ otherwise, and $\pi_{ik}=\frac{1}{4}$.

Note that by allocating a bundle to one of the agents in $A$, we get value at least $\frac{1}{m}$ per unit of resources. On the other hand, allocating a bundle to one of the agents in $B$ gets only $\frac{1}{2m}$ value per unit of resources.

Lemma 4.2

$OPT\geq n^{\prime}+\frac{1}{2m}\sum_{j\in[m]}\big{(}b-q_{j}(D^{\prime})\big{)}-\frac{1}{2}$ .

Proof.

We will prove the lemma by constructing a feasible solution with total value lower bounded by the RHS of the inequality. First, we will allocate to all agents in $A$ their desired bundles. We gain $n^{\prime}$ total value by doing so, and have a remaining supply $b-q_{j}(D^{\prime})$ of resource $j$ for any $j\in[m]$.

Then, we claim that it is possible to allocate bundles to a subset of the agents in $B$ such that we use up all but at most $1$ unit of every resource. Given that, the lemma follows because allocating bundles to agents in $B$ gives precisely $\frac{1}{2m}$ value per unit of resources.

In the rest of the proof, we will explain how to allocate bundles to agents in $B$. Note that after allocating bundles to agents in $A$, the maximum demand of any resource is at most $n^{\prime}=\frac{b}{2}$, while the minimum demand of some resource could be $0$.

We will inductively decide how to allocate bundles to agents in $B$ in $\frac{b}{2}-1$ rounds such that after round $i$ , $0\leq i\leq\frac{b}{2}-1$ , the maximum demand of any resource will be at most $n^{\prime}+i$ , while the minimum demand of any resource will be at least $2i$ . Then, at the end of the process, the maximum demand will be at most $b-1$ and the minimum demand will be at least $b-2$ . We simply allocate to two more agents such that the first one gets resources $1$ to $\frac{m}{2}$ and the second one gets resources $\frac{m}{2}+1$ to $m$ . That is, we further use one unit of each resource.

The claim is vacuously true for $i=0$ . Next, suppose we have finished the first $i-1$ rounds for some $i\geq 1$ . Let us explain how to allocate bundles in round $i$ .

Suppose the maximum and minimum demands of resources differ by at most $1$. We simply repeatedly allocate bundles to two agents in set $B$ so that one unit of each resource is allocated to exactly one of the two agents until the maximum demand equals $n^{\prime}+i$.

Otherwise, suppose the maximum and minimum demands, denoted as $d^{+}$ and $d^{-}$ respectively, differ by at least $2$. We will further divide it into three cases depending on the numbers of resources with demands $d^{+}$ and $d^{-}$, denoted as $k^{+}$ and $k^{-}$ respectively. Let us assume w.l.o.g. that the resources are sorted in ascending order of their current demands. That is, resources $1$ to $k^{-}$ are those with demands equal to $d^{-}$, and resources $m-k^{+}+1$ to $m$ are those with demands equal to $d^{+}$.

The first case is when there are $k^{-}\leq\frac{m}{2}$ resources with demand $d^{-}$ and $k^{+}>k^{-}$. In this case, consider two agents in $B$. Let us allocate resources $1$ to $\frac{m}{2}$ to the first agent, and allocate resources $1$ to $k^{-}$ together with resources $\frac{m}{2}+1$ to $m-k^{-}$ to the second one. Note that $m-k^{-}>m-k^{+}$ by our assumption on $k^{-}$ and $k^{+}$. We have increased (1) the demands of resources $1$ to $k^{-}$ by $2$ (i.e., from $d^{-}$ to $d^{-}+2\leq d^{+}$), (2) the demands of resources $k^{-}+1$ to $m-k^{+}$ by $1$ (i.e., from $<d^{+}$ to at most $d^{+}$), and (3) the demands of a subset of the resources $m-k^{+}+1$ to $m$ by $1$ (i.e., from $d^{+}$ to $d^{+}+1$). Thus, we have achieved the desired goal in round $i$.

The second case is when $k^{-}\leq\frac{m}{2}$ and $k^{+}\leq k^{-}$. We consider the same two agents as in the previous case. After allocating to those two agents, we have increased (1) the demands of resources $1$ to $k^{-}$ by $2$ (i.e., from $d^{-}$ to $d^{-}+2\leq d^{+}$), and (2) the demands of a subset of the resources $k^{-}+1$ to $m-k^{+}$ by $1$ (i.e., from $<d^{+}$ to at most $d^{+}$). Then, we further allocate to two agents in set $B$ so that one unit of each resource is allocated to exactly one of the two agents. In total, we have increased (1) the demands of resources $1$ to $k^{-}$ by $3$ (i.e., from $d^{-}$ to at most $d^{-}+3\leq d^{+}+1$), (2) the demands of resources $k^{-}+1$ to $m-k^{+}$ by either $1$ or $2$ (i.e., from $<d^{+}$ to at most $d^{+}+1$), and (3) the demands of resources $m-k^{+}+1$ to $m$ by $1$ (i.e., from $d^{+}$ to at most $d^{+}+1$). Thus, we have achieved the desired goal in round $i$.

The final case is when $k^{-}>\frac{m}{2}$. In this case, consider allocating to $4$ agents. The first and second agents get resources $1$ to $\frac{m}{2}$; the third agent gets resources $\frac{m}{2}+1$ to $m$; and the fourth agent gets resources $k^{-}-\frac{m}{2}+1$ to $k^{-}$. Then, we have increased (1) the demands of resources $1$ to $k^{-}$ by either $2$ or $3$ (i.e., from $d^{-}$ to at most $d^{-}+3\leq d^{+}+1$), and (2) the demands of all other resources by $1$ (i.e., from $\leq d^{+}$ to at most $d^{+}+1$). Thus, we have achieved the desired goal in round $i$.

Lemma 4.3

Any solution that is optimal up to an $\alpha n$ additive factor must allocate bundles to all but at most $2\alpha n+1$ agents in $A$ .

Proof.

We will prove the lemma even for fractional solutions. As the solution is allowed to be fractional and there are plenty of agents in $B$, we can assume without loss of generality that all resources are fully allocated. If we allocate to all agents in $A$, the objective would be $n^{\prime}+\frac{1}{2m}\sum_{j\in[m]}\big(b-q_{j}(D^{\prime})\big)$. Note that the value per unit of resource of allocating to agents in set $B$ is at most half of that of allocating to agents in set $A$. Thus, for each agent $i\in A$ that remains unallocated in the solution, the objective decreases by at least $\frac{1}{2}$ even if we fully allocate the resources that were allocated to $i$ to some other agents in $B$. Putting this together with Lemma 4.2 proves the lemma.

Lemma 4.4

Any solution that is optimal up to an $\alpha n$ additive factor must allocate all but at most $O(\alpha bm)$ units of the resources.

Proof.

Again, we will prove the lemma even for fractional solutions. If all resources are fully allocated and we optimally allocate to all agents in $A$ , the objective is $n^{\prime}+\frac{1}{2m}\sum_{j\in[m]}\big{(}b-q_{j}(D^{\prime})\big{)}$ . Since the value per unit of resources is at least $\frac{1}{2m}$ for any agent, putting together with Lemma 4.2 proves the lemma.

Now we are ready to introduce the reduction from differentially private query release to jointly differentially private packing. By solving the constructed packing instance in an $(\epsilon,\delta)$ -jointly differentially private manner, the allocation for agents in $B$ is $(\epsilon,\delta)$ -differentially private w.r.t. the data of agents in $A$ according to the definition of joint differential privacy. Then, we can output $b$ minus the number of units of resource $j$ allocated to agents in $B$ , denoted as $\tilde{q}_{j}$ , as the response for query $q_{j}$ . Since this is a post-processing on the output of an $(\epsilon,\delta)$ -differentially private algorithm, the responses are $(\epsilon,\delta)$ -differentially private as well.

It remains to analyze the accuracy of the responses. On one hand, $\tilde{q}_{j}$ is greater than or equal to the number of units of resource $j$ allocated to agents in $A$, which, by Lemma 4.3, is at least $q_{j}(D^{\prime})-O(\alpha n)=q_{j}(D^{\prime})-O(\alpha b)$. On the other hand, $\tilde{q}_{j}$ is at most the number of units of resource $j$ allocated to agents in $A$ plus the number of unallocated units of resource $j$. The former is at most $q_{j}(D^{\prime})$, while the latter is at most $O(\alpha b)$ on average according to Lemma 4.4. Putting these together, the $\tilde{q}_{j}$'s have average error at most $O(\alpha b)$.
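The post-processing step of the reduction is simple enough to sketch in code. The helper names, the representation of a bundle as a list of resource indices, and the toy instance below are illustrative assumptions; the jointly private packing solver that produces the allocation to $B$ is not modeled here.

```python
def true_answers(dataset, queries):
    """Counting query q_j(D') = sum over records d of q_j(d)."""
    return [sum(q(d) for d in dataset) for q in queries]


def answer_queries(b, m, bundles_given_to_B):
    """Response for query j: b minus the units of resource j allocated to agents in B."""
    used_by_B = [0] * m
    for bundle in bundles_given_to_B:
        for j in bundle:
            used_by_B[j] += 1
    return [b - used for used in used_by_B]


# Toy instance: m = 2 queries, supply b = 4, dataset of size n' = b / 2 = 2.
dataset = [(1, 0), (1, 1)]
queries = [lambda d: d[0], lambda d: d[1]]
exact = true_answers(dataset, queries)               # demand of the agents in A

# Each agent in B takes a bundle of size m / 2 = 1 (a single resource index).
full_allocation = [[0], [0], [1], [1], [1]]          # uses up all remaining supply
slack_allocation = [[0], [0], [1], [1]]              # leaves one unit of resource 1

responses_full = answer_queries(4, 2, full_allocation)
responses_slack = answer_queries(4, 2, slack_allocation)
```

When every agent in $A$ is served and no resource is left over, the responses are exact; each unallocated unit of resource $j$ adds one to the error of $\tilde{q}_{j}$, which is why Lemma 4.4 controls the upper error.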

5 Private dual online multiplicative weight algorithm

In this section, we introduce an alternative algorithm for solving the packing problem in a jointly differentially private manner. This alternative approach is similar to the previous one, with the following differences. In each step, instead of computing the best responses of all agents to the current dual prices and, thus, the corresponding subgradient, we simply pick one of the agents and use his best response to compute a proxy subgradient. The agent then gets the bundle specified by his best response. We will choose a random ordering of the agents at the beginning and pick agents in that order. As a result, the algorithm will update dual prices for only $n$ rounds as opposed to $\frac{\epsilon^{2}n^{2}}{m}$ rounds in the previous approach.
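The online variant described above can be sketched as follows. This is not a transcription of Algorithm 2: the data layout (each agent is a list of `(value, demand)` bundles), the rule that an agent takes nothing when no bundle has positive utility, the noiseless dummy constraint, and the clipping of the multiplicative factor at a small positive number are assumptions made to keep the example self-contained. The update direction follows the later remark that $-\nabla\bar{D}(p^{(t)})$ is replaced by $z_{t}-\frac{b}{n}\cdot\mathbf{1}$.

```python
import random


def laplace(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def best_response(bundles, prices):
    """Bundle maximizing value minus priced demand; None if no bundle has positive utility."""
    best, best_utility = None, 0.0
    for value, demand in bundles:
        utility = value - sum(p * a for p, a in zip(prices, demand))
        if utility > best_utility:
            best, best_utility = (value, demand), utility
    return best


def online_private_packing(agents, b, p_max, eta, sigma, seed=0):
    rng = random.Random(seed)
    n, m = len(agents), len(agents[0][0][1])
    p = [p_max / (m + 1)] * (m + 1)      # last entry prices the dummy constraint
    order = list(range(n))
    rng.shuffle(order)                   # random arrival order
    allocation = {}
    for i in order:
        choice = best_response(agents[i], p[:m])
        allocation[i] = choice           # the agent receives his best response
        y = choice[1] if choice is not None else [0.0] * m
        z = [yj + laplace(sigma, rng) for yj in y]   # noisy demand vector z_t
        # Proxy update with z_t - b/n in place of the negated sub-gradient.
        raw = [pj * max(1e-12, 1.0 + eta * (zj - b / n)) for pj, zj in zip(p[:m], z)]
        raw.append(p[m])                 # dummy constraint: zero gradient
        total = sum(raw)
        p = [p_max * r / total for r in raw]
    return allocation, p
```

Because each agent simply best-responds to prices that do not depend on his own report, charging him those prices gives the incentive property discussed in the list below the introduction of this section.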

There are both pros and cons of this alternative approach.

• The main disadvantage is that it requires a much larger supply to get the same approximation guarantees. Intuitively, this is because (1) we use proxy subgradients in place of the actual subgradients and, thus, introduce some extra error, and (2) it goes over each agent only once and, thus, does not optimize the number of rounds of dual updates.

• For the same two reasons that cause the above drawback, the approach in this section has the advantage that we get incentive compatibility for free if we charge the agent the corresponding dual prices in his round, since every agent gets the best response bundle.

• Further, it can be implemented in the online random-arrival setting, where the agents show up one by one in a random order and the algorithm must decide the allocation to each agent at his arrival.

• Last but not least, the approach in this section can be implemented in an $\epsilon$-jointly differentially private manner. Neither the dual multiplicative weight update approach in Section 3 nor the approach in previous work [12] can achieve $\epsilon$-joint differential privacy.

5.1 Proof of Theorem 1.3 (privacy)

By standard privacy properties of the Laplace mechanism and the composition theorem, the noisy demand vectors $z_{t}$ ’s are $\epsilon$ -differentially private if the noise scale is $\sigma=\frac{m}{\epsilon}$ , and are $(\epsilon,\delta)$ -differentially private if the noise scale is $\sigma=\frac{1}{\epsilon}\sqrt{8m\ln({1}/{\delta})}$ . Then, the joint differential privacy of Algorithm 2 follows by the Billboard Lemma (Lemma 2.4).
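The two noise scales can be compared numerically. The helper below only evaluates the two formulas stated above; its name and the convention that `delta=None` selects the pure $\epsilon$ variant are assumptions.

```python
import math


def noise_scale(m, eps, delta=None):
    """Laplace scale sigma: m / eps for pure privacy, else sqrt(8 m ln(1/delta)) / eps."""
    if delta is None:
        return m / eps
    return math.sqrt(8.0 * m * math.log(1.0 / delta)) / eps


pure = noise_scale(10_000, 1.0)            # sigma for epsilon-DP
approx = noise_scale(10_000, 1.0, 1e-6)    # sigma for (epsilon, delta)-DP
```

The $(\epsilon,\delta)$ scale is the smaller of the two exactly when $m>8\ln(1/\delta)$, so the relaxation pays off only once the number of resources is moderately large.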

5.2 Proof of Theorem 1.3 (approximation)

5.2.1 Key lemma

We will first establish a key lemma that is an analogue of Lemma 3.7 in the previous section.

Lemma 5.1

For any given $p$ such that $\|p\|_{1}=p_{\max}$ , with high probability, we have:

[TABLE]

The proof of Lemma 5.1 follows by a sequence of technical lemmas as follows.

Lemma 5.2

For any $p$ with $\|p\|_{1}=p_{\max}$ , and $\eta\nabla_{\max}<1$ , we have:

[TABLE]

We will omit the proof of the above lemma because it is essentially the same as that of Lemma 3.4, replacing $-\nabla\bar{D}(p^{(t)})$ with $z_{t}-\frac{b}{n}\cdot\mathbf{1}$ .

Next, we decompose the Lagrangian objective $L(x,p)$ into the sum of $n$ components $L_{i}(x_{i},p)$ ’s, $i\in[n]$ , as follows:

[TABLE]

Then, we have:

[TABLE]

Lemma 5.3

For any $p$ with $\|p\|_{1}=p_{\max}$ , and $\eta\nabla_{\max}<1$ , any fixed permutation $\lambda$ and over the randomness of the Laplacian noise, we have that with high probability:

[TABLE]

Proof.

Recall that $z_{t}=y_{t}+\nu_{t}$ . By Eqn. (5.3) and Lemma 5.2, we have:

[TABLE]

Summing over $t\in[n]$ , we get that

[TABLE]

Since $\lambda$ is a permutation, we have that $\sum_{t\in[n]}L_{\lambda(t)}(x_{\lambda(t)},p)=\sum_{t\in[n]}L_{t}(x_{t},p)=L(x,p)$. Further, recall our choice of $\nabla_{\max}=\tilde{O}(\sigma)$. So the RHS further equals

[TABLE]

Finally, let us consider the randomness of the Laplacian noise. We will use the concentration bound for martingales [2] to bound the last term. By Lemma 3.5, the last term on the RHS is at most $\tilde{O}(\sqrt{n}p_{\max}\sigma)$ with high probability. Rearranging terms, the lemma follows.

Lemma 5.4

With high probability, we have:

[TABLE]

Proof.

We will proceed in two steps. First, we will show that the expectation of the LHS is at least $OPT-\tilde{O}(\sqrt{n}p_{\max})$. Then, we will use the standard concentration bound for martingales to bound the deviation of the LHS from its expectation.

For any $t$ , let us fix the randomness in the first $t-1$ rounds and, thus, fix $p^{(t)}$ . Taking expectation over only the randomness of round $t$ , we get that:

[TABLE]

Let $x^{*}$ be the offline optimal primal solution. Since $x^{*}_{i}(p^{(t)})$ is the best response to $p^{(t)}$ , we have:

[TABLE]

Next, consider the difference between the above quantity and the actual average over $n$ agents, i.e., $\frac{1}{n}\sum_{i\in[n]}L_{i}(x_{i}^{*},p^{(t)})$ . We have:

[TABLE]

Note that $[n]\setminus\lambda(1:t-1)$ is a random subset of $n-t+1$ elements in $[n]$ . By the standard concentration bound for sampling without replacement [1], with high probability over the randomness of $\lambda(1:t-1)$ , we have that

[TABLE]

and for any $j\in[m]$

[TABLE]

Putting together, we have:

[TABLE]

where the last inequality follows by the optimality of $x^{*}$ .

Summing over $t\in[n]$ , noting that $\sum_{t\in[n]}\frac{1}{\sqrt{n-t+1}}=\sum_{t\in[n]}\frac{1}{\sqrt{t}}=O(\sqrt{n})$ , we have:

[TABLE]
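The $O(\sqrt{n})$ bound on the sum used in the summation step above follows from a comparison with an integral, since $\frac{1}{\sqrt{t}}\leq\frac{1}{\sqrt{x}}$ for every $x\in(t-1,t]$:

```latex
\sum_{t=1}^{n} \frac{1}{\sqrt{t}}
  \;\leq\; \sum_{t=1}^{n} \int_{t-1}^{t} \frac{dx}{\sqrt{x}}
  \;=\; \int_{0}^{n} \frac{dx}{\sqrt{x}}
  \;=\; 2\sqrt{n}.
```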

Then, consider a sequence of random variables $u_{t}\in[-2p_{\max},2p_{\max}]$ ’s as follows:

[TABLE]

Note that $\operatorname{E}[u_{t}\mid\lambda(1:t-1)]=0$, so $\sum u_{t}$ is a martingale. Hence, with high probability, we have that

[TABLE]

Summing over $t\in[n]$ together with Eqn. (5.4) proves the lemma.

Putting together Lemma 5.3 and Lemma 5.4 proves Lemma 5.1.

5.2.2 Approximate optimality

Let $p_{m+1}=p_{\max}$ and $p_{j}=0$ for all $j\leq m$. By Lemma 5.1, the following holds with high probability:

[TABLE]

Plugging in our choice of parameters, the RHS further equals

[TABLE]

So we have the desired approximate optimality guarantee.

5.2.3 Approximate feasibility

We choose $p$ to penalize the over-demands and, thus, make $L(x^{(t)},p)$ as small as possible. We let $p_{j^{*}}=p_{\max}$, where $j^{*}$ is the most over-demanded constraint, and let $p_{j}=0$ for any $j\neq j^{*}$. Let $s=\sum_{i\in[n]}\sum_{k\in[\ell_{i}]}a_{ij^{*}k}x_{ik}-b$ be the over-demand of $j^{*}$. By Lemma 5.1 and the choice of $p$ and $p^{(1)}$, with high probability we have:

[TABLE]

By the choice of $p$ , we further get that:

[TABLE]

Note that $ALG\leq(1+\frac{s}{b})OPT$ , because increasing the supply per resource from $b$ to $b+s$ increases the optimal packing objective by at most a $\frac{b+s}{b}$ factor. So we have:

[TABLE]

where the last inequality is due to $p_{\max}=\tilde{O}(\frac{\alpha\sqrt{n}}{\sigma})$ , $b\geq\tilde{O}\big{(}\frac{\sqrt{n}\sigma}{\alpha}\big{)}$ , and $OPT\leq n$ . So we have that:

[TABLE]

So we have $s\leq\tilde{O}(\sqrt{n}\sigma)$. Since $b\geq\tilde{O}\big(\frac{\sqrt{n}\sigma}{\alpha}\big)$, we have $s\leq\alpha b$.

A Missing proofs in Section 3

A.1 Proof of Lemma 3.4

Proof.

By the definition of KL-divergence, we get that:

[TABLE]

where the last equality is due to $\|p\|_{1}=p_{\max}$ . Next, we bound the two terms separately. The first term equals:

[TABLE]

By the definition of $\nabla\bar{D}(p^{(t)})$ , we have that

[TABLE]

and similarly $\tfrac{\eta}{p_{\max}}\big\langle p^{(t)},\nabla\bar{D}(p^{(t)})\big\rangle\geq-\tfrac{1}{2}$. Since $\ln(1-x)\leq-x$ for any $x<1$, we have:

[TABLE]

Now we bound the second term using inequalities $\ln(\frac{1}{1-xy})\leq\ln(\frac{1}{1-x})y$ for any $0\leq x,y\leq 1$ and $\ln(\frac{1}{1-xy})\leq\ln(1+x)y$ for any $0\leq x\leq 1,-1\leq y\leq 0$ .

[TABLE]

Then, we further upper bound the above using $\ln(\frac{1}{1-x})\leq x+x^{2}$ and $\ln(1+x)\geq x-x^{2}$ , and get that:

[TABLE]

Finally, since $\|p\|_{1}\leq p_{\max}$ and $|\nabla_{j}\bar{D}(p^{(t)})|\leq\nabla_{\max}$ for any $j$, the inner product $\big\langle p,|\nabla\bar{D}(p^{(t)})|\big\rangle$ is upper bounded by $p_{\max}\nabla_{\max}$. So we have:

[TABLE]

Putting (A.1), (A.2), and (A.3) together proves the lemma.
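The elementary inequalities used in this proof can be spot-checked numerically on a grid. The ranges below are assumptions where the text does not state them: $\ln(\frac{1}{1-x})\leq x+x^{2}$ is checked on $[0,\frac{1}{2}]$ and $\ln(1+x)\geq x-x^{2}$ on $[0,1]$. A grid check is of course not a proof.

```python
import math


def check_inequalities(steps: int = 200, tol: float = 1e-12) -> bool:
    """Return True if every inequality holds at every grid point."""
    for i in range(steps + 1):
        x = i / steps                       # x ranges over [0, 1]
        half = 0.5 * x                      # half ranges over [0, 1/2]
        # ln(1 / (1 - x)) <= x + x^2 on [0, 1/2]
        if -math.log(1.0 - half) > half + half * half + tol:
            return False
        # ln(1 + x) >= x - x^2 on [0, 1]
        if math.log(1.0 + x) < x - x * x - tol:
            return False
        # ln(1 - x) <= -x on [0, 1)
        if x < 1.0 and math.log(1.0 - x) > -x + tol:
            return False
        for k in range(steps + 1):
            y = k / steps                   # y ranges over [0, 1]
            # ln(1 / (1 - x y)) <= y * ln(1 / (1 - x)) for 0 <= x < 1, 0 <= y <= 1
            if x < 1.0 and -math.log(1.0 - x * y) > -y * math.log(1.0 - x) + tol:
                return False
            # Same inequality with y replaced by -y in [-1, 0]:
            # ln(1 / (1 + x y)) <= -y * ln(1 + x)
            if -math.log(1.0 + x * y) > -y * math.log(1.0 + x) + tol:
                return False
    return True


all_hold = check_inequalities()
```

The two bilinear inequalities follow from convexity of $y\mapsto\ln\frac{1}{1-xy}$ on $[0,1]$ and concavity of $y\mapsto\ln(1+xy)$, with equality at the endpoints $y=0$ and $|y|=1$.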

A.2 Proof of Lemma 3.5

Proof.

In our case, $v^{(t)}_{j}$ ’s are unbounded but have exponentially small tail contributions. We follow the standard strategy of proving Azuma-Hoeffding type of concentration bounds. By symmetry of the random variables $v^{(t)}_{j}$ ’s, it suffices to show the first inequality.

Let $X_{k}=\sum_{t=0}^{k}\big{\langle}q^{(t)},v^{(t)}\big{\rangle}$ for $0\leq k\leq T$ . For a positive $\lambda<\frac{\epsilon^{\prime}}{p_{\max}}$ whose value will be determined later, we have:

[TABLE]

Next, we upper bound $\operatorname{{E}}\big{[}\exp\big{(}\lambda\big{(}X_{T}-X_{0}\big{)}\big{)}\big{]}$ , which can be rewritten as:

[TABLE]

For any $1\leq t\leq T$ , we have that:

[TABLE]

Further, for any $1\leq j\leq m$ , we have:

[TABLE]

The second-to-last inequality holds for any $\lambda\leq\frac{\epsilon^{\prime}}{2q_{j}^{(t)}}$. Hence, we have:

[TABLE]

Next, note that $\sum_{j=1}^{m}(q^{(t)}_{j})^{2}\leq\big{(}\sum_{j=1}^{m}q^{(t)}_{j}\big{)}^{2}=p_{\max}^{2}$ . We have:

[TABLE]

and, thus,

[TABLE]

Thus, we get that:

[TABLE]

The lemma then follows by choosing

[TABLE]

A.3 Proof of Lemma 3.6

Proof.

By symmetry of the random variables $v^{(t)}_{j}$'s, it suffices to show the first inequality. For a positive $\lambda$ to be determined later, we have:

[TABLE]

Next, we bound $\operatorname{E}\left[\exp\left(\lambda\left(q^{(t)}_{j}\cdot\max\big\{0,v^{(t)}_{j}-\tfrac{\ln(T)}{\epsilon^{\prime}}\big\}\right)\right)\right]$ by:

[TABLE]

where the second-to-last inequality is due to $\lambda q^{(t)}_{j}<\lambda p_{\max}<\tfrac{\epsilon^{\prime}}{2}$. Thus, we have:

[TABLE]

So we have:

[TABLE]

and, thus,

[TABLE]

which equals $\frac{\beta}{3}$ due to our choice of $\lambda=\frac{\epsilon^{\prime}}{2p_{\max}}$ .

A.4 Proof of Lemma 3.7

Proof.

By Lemma 3.2, we have:

[TABLE]

Then, applying Lemma 3.4, we further get that:

[TABLE]

Putting together, we have:

[TABLE]

Next, we bound the last two terms separately. By the definition of $\nabla\hat{D}(p^{(t)})$ , the last term can be rewritten as:

[TABLE]

Applying Lemma 3.5 twice, we get that with probability at least $1-\tfrac{\beta}{3}$ ,

[TABLE]

As for the second-to-last term, we can upper bound it as:

[TABLE]

Note that by the definition of $\nabla_{j}\bar{D}(p^{(t)})$ , $\big{|}\nabla_{j}\hat{D}(p^{(t)})-\nabla_{j}\bar{D}(p^{(t)})\big{|}$ can be rewritten as:

[TABLE]

Further, $\nabla_{j}D(p^{(t)})=b-\sum_{i\in[n]}\sum_{k\in[\ell_{i}]}a_{ijk}x_{ik}^{(t)}\in[-n,b]\subseteq[-n,n]$. So the above is at most (recall that $\nabla_{\max}=n+\tfrac{\ln(T)}{\epsilon^{\prime}}$):

[TABLE]

Applying Lemma 3.6 twice, we get that with probability at least $1-\tfrac{2\beta}{3}$ ,

[TABLE]

Putting together (A.4), (A.5), and the preceding bound proves the lemma.

Bibliography

[1] Shipra Agrawal and Nikhil R. Devanur. Fast algorithms for online stochastic convex programming. In SODA, pages 1405–1424. SIAM, 2015.
[2] Kazuoki Azuma. Weighted sums of certain dependent random variables. Tohoku Mathematical Journal, Second Series, 19(3):357–367, 1967.
[3] Raef Bassily, Adam D. Smith, and Abhradeep Thakurta. Private empirical risk minimization: Efficient algorithms and tight error bounds. In FOCS, pages 464–473. IEEE Computer Society, 2014.
[4] Avrim Blum, Katrina Ligett, and Aaron Roth. A learning theory approach to noninteractive database privacy. J. ACM, 60(2):12:1–12:25, 2013.
[5] Niv Buchbinder and Joseph Naor. Online primal-dual algorithms for covering and packing. Mathematics of Operations Research, 34(2):270–286, 2009.
[6] Kamalika Chaudhuri and Daniel J. Hsu. Sample complexity bounds for differentially private learning. In COLT, volume 19 of JMLR Proceedings, pages 155–186. JMLR.org, 2011.
[7] Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam D. Smith. Calibrating noise to sensitivity in private data analysis. In TCC, volume 3876 of Lecture Notes in Computer Science, pages 265–284. Springer, 2006.
[8] Cynthia Dwork, Guy N. Rothblum, and Salil P. Vadhan. Boosting and differential privacy. In FOCS, pages 51–60. IEEE Computer Society, 2010.
