Near Optimal Jointly Private Packing Algorithms via Dual Multiplicative Weight Update
Zhiyi Huang, Xue Zhu

TL;DR
This paper introduces a near-optimal, differentially private packing algorithm that improves resource supply requirements and provides theoretical guarantees, along with a linear-time, truthful, online variant.
Contribution
It presents an improved $(\epsilon, \delta)$-jointly differentially private packing algorithm with better resource bounds and a matching hardness result, plus a fast, truthful, online approach.
Findings
Improves resource supply bounds from $\tilde{O}(m^2/(\epsilon\alpha))$ to $\tilde{O}(\sqrt{m}/(\epsilon\alpha))$
Provides a near-matching lower bound on resource requirements for private algorithms
Introduces a linear-time, truthful, online private packing algorithm
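Table 1: Comparison of the dual gradient descent approach by Hsu et al. [12], the dual multiplicative weight update approach (§3), and the dual online multiplicative weight update approach (§5).

| | Min supply | Exact feasibility | $(\epsilon,0)$-JDP | Online | Exact truthfulness | Running time (in $n$) |
| --- | --- | --- | --- | --- | --- | --- |
| Dual GD [12] | $\tilde{O}\big(\frac{m}{\alpha\epsilon}\big)$ (footnote 2) | No | No | No | No | |
| Dual MWU (§3) | $\tilde{O}\big(\frac{\sqrt{m}}{\alpha\epsilon}\big)$ | Yes | No | No | No | |
| Dual online MWU (§5) | $\tilde{O}\big(\frac{\sqrt{nm}}{\alpha\epsilon}\big)$ | Yes | Yes (footnote 3) | Yes | Yes | |

Footnote 2: Hsu et al. [12] requires the total supply of all resources to be at least $\tilde{O}\big(\frac{m^2}{\alpha\epsilon}\big)$. We divide it by $m$ and interpret the result as the (average) supply per resource for a direct comparison with the supply requirements in this paper.
Footnote 3: To get $(\epsilon,0)$-JDP, we need a larger supply of at least $\tilde{O}\big(\frac{m\sqrt{n}}{\alpha\epsilon}\big)$ of each resource.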
Zhiyi Huang and Xue Zhu: Department of Computer Science, the University of Hong Kong. This work is supported in part by an RGC grant HKU27200214E. {zhiyi,xzhu2}@cs.hku.hk
Abstract
We present an improved $(\epsilon,\delta)$-jointly differentially private algorithm for packing problems. Our algorithm gives a feasible output that is approximately optimal up to an $\alpha n$ additive factor as long as the supply of each resource is at least $\tilde{O}\big(\frac{\sqrt{m}}{\alpha\epsilon}\big)$, where $m$ is the number of resources. This improves the previous result by Hsu et al. (SODA ’16), which requires the total supply to be at least $\tilde{O}\big(\frac{m^2}{\alpha\epsilon}\big)$, and only guarantees approximate feasibility in terms of total violation. Further, we complement our algorithm with an almost matching hardness result, showing that a supply of $\Omega\big(\frac{\sqrt{m}}{\alpha\epsilon}\big)$, up to logarithmic factors, is necessary for any $(\epsilon,\delta)$-jointly differentially private algorithm to compute an approximately optimal packing solution. Finally, we introduce an alternative approach that runs in linear time, is exactly truthful, can be implemented online, and can be $(\epsilon,0)$-jointly differentially private, but requires a larger supply of each resource.
1 Introduction
Handling user data has become the focal point of modern computational problems, bringing up many new challenges including user privacy. The consensus notion of privacy in theoretical computer science is differential privacy introduced by Dwork et al. [7]. Informally, a mechanism is differentially private if the output distribution is insensitive to the change of an individual’s data. Since its introduction, the community has introduced differentially private algorithms for a variety of problems, including computation of numerical statistics (e.g., [9, 4]), machine learning (e.g., [6, 15]), game theory (e.g., [17, 14]), etc.
However, many fundamental combinatorial problems, such as the bipartite matching problem, provably do not admit any differentially private algorithm with non-trivial approximation guarantee [11]. To resolve the situation, Hsu et al. [11] adopt a relaxed notion called joint differential privacy by Kearns et al. [16] and introduce jointly differentially private algorithms for the bipartite matching problem and more generally the welfare maximization problem w.r.t. gross substitutes valuations. Hsu et al. [12] further develop the techniques in [11], and propose a general framework that solves a large family of convex programs in a jointly differentially private manner via a noisy version of the dual gradient descent method.
While our techniques can be applied to the generic convex programs studied in [12], we focus on the packing problem in this conference version due to space constraints. Consider a packing problem with $n$ agents and $m$ resources where each agent has different values for different bundles of resources. The value and resource demands of an agent are private. The goal is to allocate bundles to agents so as to maximize the sum of values of all agents subject to the supply constraints of resources. The algorithm by Hsu et al. [12] incurs an additive loss in terms of the objective and up to additive total violation of the supply constraints. In other words, their algorithm guarantees up to additive loss in the objective if , and the total violation is at most fraction of the total supply if the supply per constraint is at least . No non-trivial guarantee is given in terms of per-constraint violation.
Our contributions. Our first result is an $(\epsilon,\delta)$-jointly differentially private algorithm (Sec. 3) that improves the results by Hsu et al. [12] in two ways: (1) it reduces the supply requirement in terms of the dependence on the number of resources $m$; and (2) it provides exact feasibility with high probability. (Footnote 1: We note that Hsu et al. [12] can also obtain exact feasibility by shifting the error to the objective, under the extra assumption that each unit of resource provides at most value in the objective. Our result does not need such an assumption.) Concretely, we show the following:
Theorem 1.1
For any packing problem with $m$ constraints, there is an $(\epsilon,\delta)$-jointly differentially private algorithm whose output is feasible and approximately optimal up to an $\alpha n$ additive factor as long as the supply of each resource is at least $\tilde{O}\big(\frac{\sqrt{m}}{\alpha\epsilon}\big)$.
Our algorithm is a noisy version of the multiplicative weight update algorithm running in the dual space. To compute a primal packing solution, we maintain a set of dual prices that coordinates the primal decisions: an item is selected in the packing if and only if its value is greater than the total price of its resource demands. The dual prices are then updated using the multiplicative weight update method (e.g., [1]) with Laplacian noise added to each iteration to ensure that the dual prices are differentially private. Finally, the billboard lemma introduced by Hsu et al. [11] indicates that the primal packing solution satisfies joint differential privacy if the coordinators, i.e., the dual prices, satisfy differential privacy.
The main technical challenges then arise from bounding the error introduced by the Laplacian noise. First, the standard analysis of the multiplicative weight update framework requires the update weights to be bounded, while our noisy update weights are unbounded. To resolve this problem, we truncate noisy update weights that are either too large or too small, and bound the error due to this truncation using the small-tail properties of Laplacian distributions (Lemma 3.6). Second, to obtain with-high-probability guarantees, one resorts to concentration bounds, which generally require the random variables to be bounded. To this end, we introduce a concentration lemma for Laplacian distributed variables (Lemma 3.5).
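For intuition about why truncating the noise costs little, the following standard facts about the Laplace distribution may help. They are textbook properties rather than restatements of Lemma 3.5 or Lemma 3.6, and the symbols $\sigma$ (noise scale), $t$ (truncation threshold) and $\beta$ (failure probability) are notation chosen here for illustration.

```latex
X \sim \mathrm{Lap}(\sigma):\quad
f_X(x) = \frac{1}{2\sigma}\, e^{-|x|/\sigma},
\qquad
\Pr\big[\,|X| \ge t\,\big] = e^{-t/\sigma} \quad (t \ge 0).
% Consequently, truncating at t = \sigma \ln(1/\beta) alters X only with probability \beta:
\Pr\big[\,|X| \ge \sigma \ln(1/\beta)\,\big] = \beta .
```

In other words, a truncation threshold that is only logarithmically larger than the noise scale already makes truncation a rare event.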
Then, we complement our algorithm with an almost matching lower bound on the minimum supply required for any $(\epsilon,\delta)$-jointly differentially private algorithm to compute an approximately optimal solution (Sec. 4).
Theorem 1.2
(Hardness for $(\epsilon,\delta)$-private algorithms) If there is an $(\epsilon,\delta)$-jointly differentially private algorithm that with high probability outputs a feasible solution for the packing problem that is approximately optimal up to an additive $\alpha n$, then the supply per constraint must be at least
[TABLE]
This lower bound significantly improves the best previous bound of $\Omega\big(1/\sqrt{\alpha}\big)$ by Hsu et al. [11]. It matches the supply requirement in Theorem 1.1 up to log factors, certifying that our algorithm has achieved near optimal trade-offs between privacy and accuracy. The proof of this hardness result draws a novel connection between jointly differentially private packing algorithms and differentially private query release algorithms. We show that one can take an arbitrary jointly differentially private packing algorithm as a black box and use it to construct a differentially private query release algorithm for answering arbitrary counting queries. Then, the hardness result follows from existing hardness for query release by Steinke and Ullman [23].
Having achieved the optimal tradeoffs between privacy and accuracy, we turn to other aspects of the algorithm such as running time. A common drawback of the previous dual gradient descent approach by Hsu et al. [12] and of the algorithm in Theorem 1.1 is the cubic dependence on $n$ in the running time. Is there a privacy-preserving algorithm that goes over each agent’s data only once and still finds an approximately optimal packing solution?
To this end, we introduce an alternative approach that can be interpreted as a privacy-preserving version of the online packing algorithm in the random-arrival model by Agrawal and Devanur [1]. In each round of dual update, instead of computing the best responses of all agents to calculate an accurate dual subgradient, the new approach picks one agent and uses his best response to calculate a proxy subgradient and updates the dual prices accordingly; the agent gets his best response bundle. The algorithm updates the dual prices for exactly $n$ rounds, using each agent to compute the proxy subgradient exactly once. The order in which the algorithm picks the agents is chosen uniformly at random at the beginning.
The new approach runs in linear time, significantly faster than the other approaches whose running time has cubic dependence on $n$. However, it needs a larger supply of each resource, i.e., at least $\tilde{O}\big(\frac{\sqrt{nm}}{\alpha\epsilon}\big)$, in order to compute an approximately optimal solution. Whether there exists a jointly differentially private algorithm that both runs in linear time and achieves the optimal tradeoffs between privacy and accuracy characterized in Theorem 1.1 and Theorem 1.2 is an interesting open problem.
Further, the new approach can be implemented in the online random-arrival setting, where the agents show up one by one in a random order and the algorithm must decide the allocation to each agent at his arrival. It also achieves exact truthfulness if we charge each agent the dual prices in his round because every agent gets his best response bundle. The previous approaches by Hsu et al. [13, 12] achieve only approximate truthfulness.
Last but not least, the approach in this section can be implemented in an $(\epsilon,0)$-jointly differentially private manner, provided that the supply of each resource is at least $\tilde{O}\big(\frac{m\sqrt{n}}{\alpha\epsilon}\big)$. Neither the dual multiplicative weight update approach in Theorem 1.1 nor the approach in previous work [12] can achieve $(\epsilon,0)$-joint differential privacy, unless the supply in which case the problem is trivial.
Theorem 1.3
For any packing problem with $m$ constraints, there is a linear time, truthful, and $(\epsilon,\delta)$-jointly differentially private algorithm whose output is feasible and approximately optimal up to an $\alpha n$ additive factor as long as the supply of each resource is at least $\tilde{O}\big(\frac{\sqrt{nm}}{\alpha\epsilon}\big)$. Further, the algorithm can work in the online random-arrival model, and can get $(\epsilon,0)$-joint differential privacy if the supply of each resource is at least $\tilde{O}\big(\frac{m\sqrt{n}}{\alpha\epsilon}\big)$.
We present in Table 1 a brief comparison of the dual gradient descent approach by Hsu et al. [12], the dual multiplicative weight update approach in Theorem 1.1, and the dual online multiplicative weight update approach in Theorem 1.3.
Other related work. McSherry and Talwar [17] propose the exponential mechanism as a generic method for designing differentially private algorithms for optimization problems for which the set of feasible outcomes does not depend on user data. As one may expect, such a generic method is not computationally efficient in general. Bassily et al. [3] introduce an efficient implementation of the exponential mechanism when the set of feasible outcomes further forms a compact subset in the Euclidean space and the objective is a convex function. However, these techniques are not directly applicable to our problem as the set of feasible outcomes of packing problems crucially depends on user data. Hsu et al. [13] systematically study what linear programs can be solved in a differentially private manner and what cannot. However, the packing linear program provably cannot be solved differentially privately [11].
Since the introduction of joint differential privacy by Kearns et al. [16], it has found a wide range of applications, including equilibrium selection [20], max flow [21], mechanism design [16], privacy-preserving surveys [10], etc. A common technical ingredient of these works is the billboard lemma introduced by Hsu et al. [11], which also serves as an important building block of the analysis in this paper.
The packing problem has been extensively studied in both the offline (e.g., [19, 22]) and online settings (e.g., [5]). We note that many of these works use the primal-dual technique. Our work can be viewed as an adaptation of these techniques to the privacy-preserving context.
2 Preliminaries
2.1 Packing problem and the (partial) dual
Consider a packing problem with $n$ agents and $m$ resources. Let $[k]$ denote the set $\{1,\dots,k\}$ for any positive integer $k$. Each agent demands one of bundles of resources. If we allocate a bundle to agent , his valuation will be and an amount of resource will be consumed for every ; if we do not allocate any bundle to agent , his valuation will be $0$ and no resource will be consumed. The parameters associated with an agent are the private data of the agent. Let denote the data universe and let denote a dataset of $n$ agents. Each resource has supply . The goal is then to choose a subset of the items that maximizes the total valuation subject to the supply constraints. This can be formulated as the following packing linear program:
[TABLE]
Let denote the set of feasible decisions associated with agent and denote the feasible decisions of all agents if we ignore the supply constraints. The partial Lagrangian of the above program is:
[TABLE]
Let denote the above partial Lagrangian objective, that is,
[TABLE]
Let denote the dual objective. We shall interpret as the unit price of resource for any . Given a set of prices , an optimal solution of the optimization problem is:
[TABLE]
Here, we break ties in lexicographical order in the maximization problem of the first case.
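The price-based best response described above can be sketched as follows. This is a minimal illustration rather than the paper's exact definition: the function name `best_response`, the array layout, and the rule "take a bundle only when its surplus is strictly positive" are assumptions of this sketch, with ties broken toward the smallest bundle index as a stand-in for lexicographic order.

```python
import numpy as np

def best_response(values, demands, prices):
    """Pick one agent's best bundle at the given resource prices.

    values:  shape (num_bundles,), value of each bundle (hypothetical layout)
    demands: shape (num_bundles, m), resource demand of each bundle
    prices:  shape (m,), current unit price of each resource
    Returns the chosen bundle index, or None if no bundle has positive surplus.
    """
    values = np.asarray(values, dtype=float)
    demands = np.asarray(demands, dtype=float)
    prices = np.asarray(prices, dtype=float)
    # Surplus of a bundle = its value minus the total price of its demands.
    surplus = values - demands @ prices
    # np.argmax returns the first maximizer, i.e. the smallest index on ties.
    k = int(np.argmax(surplus))
    return k if surplus[k] > 0 else None
```

With two bundles of equal surplus the sketch returns index 0, and when prices exceed every bundle's value it returns `None`, meaning the agent is allocated nothing.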
The following lemma follows by the Envelope theorem (e.g., [18]).
Lemma 2.1
Given any set of prices $p$, $\nabla_{p}L\big(x^{*}(p),p\big)$ is a sub-gradient of the dual objective at $p$.
2.2 Joint differential privacy
Next, we present necessary preliminaries for differential privacy and joint differential privacy. Two datasets $D, D'$ are $i$-neighbors if they differ only in their $i$-th entry, that is, $D_j = D'_j$ for all $j \neq i$. The notion of differential privacy by Dwork et al. [7] requires the output distributions to be similar for any neighboring datasets.
Definition 2.1 (Differential privacy [7])
A mechanism is -differentially private if for any , any -neighbors , and any subset , we have:
[TABLE]
The Laplace mechanism by Dwork et al. [7] computes numerical statistics of a dataset differentially privately by adding Laplacian distributed noise to the output. We shall use the Laplace mechanism to maintain a sequence of differentially private dual prices.
Definition 2.2 (Laplace mechanism [7])
Suppose has sensitivity , i.e., for any neighboring and . Given a database , the Laplace mechanism outputs , where .
Lemma 2.2 ([7])
The Laplace mechanism is -differentially private.
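As a hedged illustration of the Laplace mechanism, the sketch below adds independent Laplace noise of scale `sensitivity / epsilon` to each coordinate of a statistic. That scale is the standard calibration for pure $\epsilon$-differential privacy under $\ell_1$ sensitivity; the function name and parameters are assumptions of this sketch, not the paper's notation.

```python
import numpy as np

def laplace_mechanism(f_value, sensitivity, epsilon, rng):
    """Return f_value plus i.i.d. Laplace noise with scale sensitivity / epsilon.

    f_value:     the exact statistic, scalar or array-like
    sensitivity: assumed l1-sensitivity of the statistic over neighboring datasets
    epsilon:     privacy parameter, must be positive
    rng:         a numpy.random.Generator
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    f_value = np.asarray(f_value, dtype=float)
    scale = sensitivity / epsilon
    noise = rng.laplace(loc=0.0, scale=scale, size=f_value.shape)
    return f_value + noise
```

A smaller `epsilon` gives stronger privacy and proportionally larger noise; for example, `laplace_mechanism(counts, 1.0, 0.1, np.random.default_rng())` perturbs each count with noise of scale 10.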
One can combine differentially private subroutines to obtain algorithms for more complicated tasks; the privacy parameter will scale gracefully. This is formalized as the composition theorem:
Lemma 2.3 (Composition Theorem [8])
Suppose is a -fold adaptive composition of -differentially private mechanisms. Then, satisfies -differential privacy for
[TABLE]
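To make the phrase "scale gracefully" concrete, the sketch below evaluates the widely cited advanced composition bound of Dwork, Rothblum and Vadhan, under which a $k$-fold adaptive composition of $(\epsilon,\delta)$-differentially private mechanisms is $(\epsilon', k\delta + \delta')$-differentially private with $\epsilon' = \epsilon\sqrt{2k\ln(1/\delta')} + k\epsilon(e^{\epsilon}-1)$. The exact form stated in Lemma 2.3 may differ in presentation; the function names are assumptions of this sketch.

```python
import math

def basic_composition(eps, delta, k):
    """Privacy parameters of k-fold composition under the basic (linear) bound."""
    return k * eps, k * delta

def advanced_composition(eps, delta, k, delta_prime):
    """Privacy parameters of k-fold adaptive composition under the advanced bound.

    delta_prime is the extra failure probability one is willing to pay
    in exchange for the square-root dependence on k.
    """
    eps_total = eps * math.sqrt(2 * k * math.log(1 / delta_prime)) \
        + k * eps * (math.exp(eps) - 1)
    return eps_total, k * delta + delta_prime
```

For small per-round `eps` and many rounds, the advanced bound grows roughly like $\sqrt{k}$ rather than $k$, which is why iterative algorithms with many noisy updates can still achieve a reasonable overall privacy budget.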
As mentioned in the introduction, for many optimization problems including the packing problem considered in this paper, we need to consider a relaxed notion called joint differential privacy proposed by Kearns et al. [16]. Informally, joint differential privacy is defined w.r.t. problems whose outputs are comprised of $n$ components, one for each of the $n$ agents. It relaxes the requirement of differential privacy so that for any $i$-neighboring datasets, only the output components of agents other than $i$ need to be similarly distributed.
Definition 2.3 (Joint differential privacy [16])
A mechanism is -jointly differentially private if for any , any -neighbors , and any subset , we have:
[TABLE]
The most important connection between joint differential privacy and differential privacy is the following billboard lemma established by Hsu et al. [11]. This lemma has been the cornerstone of many recent works on joint differential privacy and plays a crucial role in this paper as well.
Lemma 2.4 (Billboard Lemma [11])
Suppose is -differentially private. Consider any set of functions . Then the mechanism that outputs to each agent is -jointly differentially private.
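The structure behind the billboard lemma can be sketched in a few lines: a differentially private signal is posted publicly, and each agent's output is computed from that signal together with the agent's own data only. The helper names `billboard_outputs` and `buys`, and the toy "buy if value exceeds price" rule, are hypothetical and chosen purely for illustration; the sketch shows the information flow and does not itself establish any privacy guarantee.

```python
def billboard_outputs(private_data, dp_signal, local_rule):
    """Compute one output per agent from the public signal and that agent's own data.

    private_data: list with one private record per agent
    dp_signal:    a value assumed to have been produced by a differentially
                  private mechanism (the "billboard")
    local_rule:   function (own_record, dp_signal) -> agent's output
    """
    return [local_rule(record, dp_signal) for record in private_data]

def buys(value, price):
    """Toy local rule: an agent takes the item iff its value exceeds the posted price."""
    return value > price

# Toy usage: three agents react to a posted price of 0.5.
outputs = billboard_outputs([0.9, 0.2, 0.6], dp_signal=0.5, local_rule=buys)
```

Because agent $i$'s data enters only the $i$-th output, the outputs seen by the other agents depend on agent $i$ only through the private signal, which is the intuition the lemma formalizes.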
3 Private dual multiplicative weight algorithm
Our algorithm is a noisy version of the multiplicative weight update algorithm running in the dual space. First, recall the partial Lagrangian objective of the packing problem:
[TABLE]
For simplicity, we assume for all . It is straightforward to extend our results to general values of ’s. Then, it is without loss to assume that as otherwise the optimal solution is trivial with where , for all .
For convenience of discussion, we add a dummy constraint as the -th constraint. As a result, there is a new dual variable corresponding to the new constraint and, thus, becomes a dimension vector. We will restrict such that its norm equals some appropriately chosen .
The multiplicative weight update algorithm finds a set of dual prices that approximately minimizes the dual objective . In the process, it also finds an approximately optimal primal solution. Concretely, it starts with an initial , where is the all- vector. In each round , it first computes a sub-gradient of the dual objective using the envelope theorem, which boils down to computing the best response of the agents to the current dual prices. Then, it computes by multiplying each entry of by for some appropriately chosen step size , and normalizing it to have norm . (This is equivalent to a projection back to the simplex with respect to the Kullback-Leibler divergence.) In order to get joint differential privacy, we use a noisy version of the sub-gradient in our algorithm and show that the error introduced by the Laplacian noise can be bounded. The algorithm is presented as Algorithm 1.
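The loop described above can be sketched as follows. This is a simplified illustration under stated assumptions and not a transcription of Algorithm 1: the parameter names (`eta`, `noise_scale`, `clip`, `p_norm`), the choice to leave the dummy coordinate noise-free with zero excess demand, and the use of clipped excess demand as the update weight are all assumptions of this sketch, and the values that make the loop private and accurate are the ones fixed by the paper's analysis.

```python
import numpy as np

def noisy_dual_mwu(values, demands, supply, rounds, eta, noise_scale, clip, p_norm, rng):
    """Sketch of a noisy multiplicative weight update on dual prices.

    values:  shape (n, num_bundles), value of each bundle for each agent
    demands: shape (n, num_bundles, m), resource demands of each bundle
    supply:  per-resource supply (scalar, same for all resources)
    Returns the final price vector of length m + 1 (last entry: dummy constraint).
    """
    n, _, m = demands.shape
    prices = np.full(m + 1, p_norm / (m + 1))   # uniform start with l1 norm p_norm
    agents = np.arange(n)
    for _ in range(rounds):
        # Best responses of all agents to the current prices of the m real resources.
        surplus = values - demands @ prices[:m]           # shape (n, num_bundles)
        choice = surplus.argmax(axis=1)
        take = surplus[agents, choice] > 0
        used = (demands[agents, choice] * take[:, None]).sum(axis=0)   # shape (m,)
        # Noisy excess demand per resource, truncated to keep updates bounded.
        noisy_excess = used - supply + rng.laplace(0.0, noise_scale, size=m)
        noisy_excess = np.clip(noisy_excess, -clip, clip)
        update = np.append(noisy_excess, 0.0)             # dummy coordinate: no excess
        # Multiplicative update, then renormalize to l1 norm p_norm.
        prices = prices * np.exp(eta * update)
        prices *= p_norm / prices.sum()
    return prices
```

Over-demanded resources see their prices rise relative to the others, which pushes later best responses away from them; the renormalization step plays the role of the projection back to the scaled simplex mentioned in the text.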
3.1 Proof of Theorem 1.1 (privacy)
The privacy part follows from the next Lemma 3.1 and the Billboard Lemma (Lemma 2.4).
Lemma 3.1
The sequence of duals given by Pri-DMW is $(\epsilon,\delta)$-differentially private.
Proof.
Note that the sequence of dual price vectors is determined by a sequence of noisy sub-gradients ’s. Hence, it suffices to show the sequence of noisy sub-gradients is $(\epsilon,\delta)$-differentially private.
Next, observe that the noisy sub-gradient of each step is computed by adding Laplace noise of scale to the sub-gradient . By our assumption, the sub-gradient has sensitivity . Hence, given , the computation of the noisy sub-gradient satisfies -differential privacy (Lemma 2.2). Further, is determined by the noisy sub-gradients in previous rounds. Hence, the sequence of noisy sub-gradients is computed via an adaptive composition of Laplace mechanisms each of which is -differentially private. The lemma then follows by the composition theorem (Lemma 2.3).
3.2 Proof of Theorem 1.1 (approximation)
In this subsection, we will show that our algorithm violates each constraint by at most and is approximately optimal up to an additive factor. Then, to get exact feasibility as in Theorem 1.1, we simply run Pri-DMW with as the supply per constraint, noting that doing so decreases the optimal objective by at most an multiplicative factor.
We first introduce some useful facts and technical lemmas.
Lemma 3.2
For any , we have that:
[TABLE]
Proof.
By the definition of , we have:
[TABLE]
where the second equality is due to the envelope theorem (Lemma 2.1).
Lemma 3.3
If , then we have .
Proof.
Plug in the value of . We have:
[TABLE]
So if , we have .
Lemma 3.4
For any with , and , we have:
[TABLE]
This lemma follows by the standard analysis of multiplicative weight update. We include the proof in Appendix A for the sake of completeness. The proofs of the next three lemmas are also deferred to Appendix A.
Lemma 3.5
For any such that and that depends only on , we have that:
[TABLE]
and
[TABLE]
Lemma 3.6
For any such that and , and that depends only on for , we have that with probability at most
[TABLE]
and with probability at most
[TABLE]
Putting the above pieces together, we have the following key lemma that is useful for showing approximate optimality and feasibility.
Lemma 3.7
For any such that , with probability at least , we have:
[TABLE]
3.2.1 Approximate optimality
Lemma 3.8
For any , we have .
Proof.
Let be the optimal primal solution. Recall the definition of the Lagrangian objective. We have that:
[TABLE]
Since that maximizes , we get that:
[TABLE]
Rearranging terms, this is further equal to
[TABLE]
Finally, note that by the definition , the first term equals and the second term is greater than or equal to zero. So the lemma follows.
Proof.
(Approximate optimality) Recall that we add a dummy constraint , we let and for all . By Lemma 3.7, the following holds with probability at least :
[TABLE]
If , we further bound the 2nd and 3rd terms by
[TABLE]
So we get that
[TABLE]
Note that . By our choice of , we have . By Lemma 3.8, we have . So we have:
[TABLE]
So we have the desired approximate optimality guarantee.
3.2.2 Approximate feasibility
Proof.
(Approximate feasibility) We choose to penalize the over-demands and, thus, make as small as possible. We let , where is the most over-demanded constraint and let for any . Let be the over-demand of . By Lemma 3.7 and the choice of and , with probability at least , we have:
[TABLE]
By the choice of , we further get that:
[TABLE]
Putting this together with Lemma 3.8, the LHS of the above inequality is at least .
Also note that , because increasing the supply per resource from to increases the optimal packing objective by at most a factor. So we have:
[TABLE]
So we have that:
[TABLE]
Plugging in the choice of parameters, we get that
[TABLE]
Therefore, the max violation per constraint is at most as long as the supply per constraint is at least .
4 Hardness (Theorem 1.2)
Theorem 4.1 (Theorem 1.1 of [23])
Suppose there is an -differentially private algorithm for answering arbitrary counting queries on a dataset of size with average error at most . Then, we have
[TABLE]
Lemma 4.1
Suppose for some and , there is an -jointly differentially private algorithm that with high probability outputs a feasible solution that is optimal up to an additive factor for . Then, there is an -differentially private algorithm for answering arbitrary counting queries on any dataset of size with average error at most .
Proof.
Consider an arbitrary dataset of size , denoted as , and an arbitrary set of counting queries of sensitivity , denoted as . Construct an instance of the packing problem with agents as follows:
Let there be a set of agents, denoted as , each of whom demands a unique bundle, i.e., and therefore we omit subscript in the following. The resource demanded in the bundle is and the value is .
Further, let there be agents, denoted as , each of whom demands any subset of size and has value . That is, ; for any subset of size , let there be a such that if and $0$ otherwise, and .
Note that by allocating a bundle to one of the agents in , we get value at least per unit of resources. On the other hand, allocating a bundle to one of the agents in gets only value per unit of resources.
Lemma 4.2
$OPT \geq n' + \frac{1}{2m}\sum_{j\in[m]}\big(b - q_{j}(D')\big) - \frac{1}{2}$.
Proof.
We will prove the lemma by constructing a feasible solution with total value lower bounded by the RHS of the inequality. First, we will allocate to all agents in their desired bundles. We gain total value by doing so, and have a remaining supply of resource for any .
Then, we claim that it is possible to allocate bundles to a subset of the agents in such that we use up all but at most unit of every resource. Given that, the lemma follows because allocating bundles to agents in gives precisely value per unit of resources.
In the rest of the proof, we will explain how to allocate bundles to agents in . Note that after allocating bundles to agents in , the maximum demand of any resource is at most , while the minimum demand of some resource could be $0$.
We will inductively decide how to allocate bundles to agents in in rounds such that after round , , the maximum demand of any resource will be at most , while the minimum demand of any resource will be at least . Then, at the end of the process, the maximum demand will be at most and the minimum demand will be at least . We simply allocate to two more agents such that the first one gets resources to and the second one gets resources to . That is, we further use one unit of each resource.
The claim is vacuously true for . Next, suppose we have finished the first rounds for some . Let us explain how to allocate bundles in round .
Suppose the maximum and minimum demands of resources differ by at most . We simply repeatedly allocate bundles to two agents in set so that one unit of each resource is allocated to exactly one of the two agents until the maximum demand equals .
Otherwise, suppose the maximum and minimum demands, denoted as and respectively, differ by at least . We will further divide it into three cases depending on the numbers of resources with demands and respectively, denoted as and respectively. Let us assume w.l.o.g. that the resources are sorted in ascending order of their current demands. For example, resources to are those with demands equal to , and resources to are those with demands equal to .
The first case is when there are resources with demand and . In this case, consider two agents in . Let us allocate items to to the first agent, and allocate items to together with items to to the second one. Note that by our assumption on and . We have increased (1) the demands of resources to by (i.e., from to ), (2) the demands of resources to by (i.e., from to at most ), and (3) demands of a subset of the resources to by (i.e., from to ). Thus, we have achieved the desired goal in round .
The second case is when and . We consider the same two agents as in the previous case. After allocating to those two agents, we have increased (1) the demands of resources to by (i.e., from to ), and (2) the demands of a subset of the resources to by (i.e., from to at most ). Then, we further allocate to two agents in set so that one unit of each resource is allocated to exactly one of the two agents. Then, we have increased (1) the demands of resources to by (i.e., from to at most ), (2) the demands of resources to by either or (i.e., from to at most ), and (3) the demands of resources to by (i.e., from to at most ). Thus, we have achieved the desired goal in round .
The final case is when . In this case, consider allocating to agents. The first and second agents get resources to ; the third agent gets resources to ; and the fourth agent gets resources to . Then, we have increased (1) the demands of resources to by either or (i.e., from to at most ), (2) the demands of all other resources by (i.e., from to at most ). Thus, we have achieved the desired goal in round .
Lemma 4.3
Any solution that is optimal up to an additive factor must allocate bundles to all but at most agents in .
Proof.
We will prove the lemma even for fractional solutions. As the solution is allowed to be fractional and there are plenty of agents in , we can assume without loss that all resources are fully allocated. If we allocate to all agents in , the objective would be $n' + \frac{1}{2m}\sum_{j\in[m]}\big(b - q_{j}(D')\big)$. Note that the value per unit of resource of allocating to agents in set is at most a half of that of allocating to agents in set . Thus, for each agent that remains unallocated in the solution, the objective decreases by at least even if we fully allocate the resources that were allocated to that agent to some other agents in . Putting this together with Lemma 4.2 proves the lemma.
Lemma 4.4
Any solution that is optimal up to an additive factor must allocate all but at most units of the resources.
Proof.
Again, we will prove the lemma even for fractional solutions. If all resources are fully allocated and we optimally allocate to all agents in , the objective is $n' + \frac{1}{2m}\sum_{j\in[m]}\big(b - q_{j}(D')\big)$. Since the value per unit of resources is at least for any agent, putting this together with Lemma 4.2 proves the lemma.
Now we are ready to introduce the reduction from differentially private query release to jointly differentially private packing. By solving the constructed packing instance in an -jointly differentially private manner, the allocation for agents in is -differentially private w.r.t. the data of agents in according to the definition of joint differential privacy. Then, we can output minus the number of units of resource allocated to agents in , denoted as , as the response for query . Since this is a post-processing on the output of an -differentially private algorithm, the responses are -differentially private as well.
It remains to analyze the accuracy of the responses. On one hand, is greater than or equal to the number of units of resource allocated to agents in , which, by Lemma 4.3, is at least . On the other hand, is at most the number of units of resource allocated to agents in plus the number of unallocated units of resource . The former is at most while the latter is at most on average according to Lemma 4.4. Putting these together, the ’s have average error at most .
5 Private dual online multiplicative weight algorithm
In this section we introduce an alternative algorithm for solving the packing problem in a jointly differentially private manner. This alternative approach is similar to the previous one, with the following differences. In each step, instead of computing the best responses of all agents to the current dual prices and, thus, the corresponding subgradient, we simply pick one of the agents and use his best response to compute a proxy subgradient. The agent then gets the bundle specified by his best response. We will choose a random ordering of the agents at the beginning and pick agents in that order. As a result, the algorithm will update dual prices for only $n$ rounds as opposed to rounds in the previous approach.
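The one-pass structure described above can be sketched as follows. This is an illustrative simplification and not a transcription of Algorithm 2: the names `eta`, `noise_scale` and `p_norm`, the use of `supply / n` as each round's share of the supply in the proxy subgradient, and the noise-free dummy coordinate are assumptions of this sketch, and the sketch by itself does not enforce feasibility or any privacy guarantee.

```python
import numpy as np

def online_dual_mwu(values, demands, supply, eta, noise_scale, p_norm, rng):
    """Sketch of a one-pass (online) dual multiplicative weight update.

    values:  shape (n, num_bundles); demands: shape (n, num_bundles, m)
    supply:  per-resource supply (scalar)
    Returns (allocation, prices): allocation[i] is agent i's bundle index or None.
    """
    n, _, m = demands.shape
    prices = np.full(m + 1, p_norm / (m + 1))   # last coordinate: dummy constraint
    allocation = [None] * n
    for i in rng.permutation(n):                # uniformly random arrival order
        # The arriving agent best-responds to the current prices and keeps that bundle.
        surplus = values[i] - demands[i] @ prices[:m]
        k = int(np.argmax(surplus))
        used = np.zeros(m)
        if surplus[k] > 0:
            allocation[i] = k
            used = demands[i, k]
        # Proxy subgradient: noisy demand of this one agent versus a 1/n share of supply.
        noisy_demand = used + rng.laplace(0.0, noise_scale, size=m)
        update = np.append(noisy_demand - supply / n, 0.0)
        prices = prices * np.exp(eta * update)
        prices *= p_norm / prices.sum()
    return allocation, prices
```

Each agent's data is touched exactly once, which is what gives the linear running time, and the agent's own bundle is fixed using prices that were computed before his arrival, which is the property behind charging him those prices.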
There are both pros and cons of this alternative approach.
- The main disadvantage is that it requires a much larger supply to get the same approximation guarantees. Intuitively, this is because (1) we use proxy subgradients in place of the actual subgradients and, thus, introduce some extra error, and (2) it goes over each agent only once and, thus, does not optimize the number of rounds of dual updates.
- For the same two reasons that cause the above drawback, the approach in this section has the advantage that we get incentive compatibility for free if we charge the agent the corresponding dual prices in his round, since every agent gets the best response bundle.
- Further, it can be implemented in the online random-arrival setting, where the agents show up one by one in a random order and the algorithm must decide the allocation to each agent at his arrival.
- Last but not least, the approach in this section can be implemented in an -jointly differentially private manner. Neither the dual multiplicative weight update approach in Section 3 nor the approach in previous work [12] can achieve -joint differential privacy.
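The round structure described in this section can be sketched in a few lines of Python. This is a minimal illustration and not the paper's Algorithm 2: the input layout (`values`, `bundles`), the slack coordinate that keeps the prices summing to at most `p_max`, the step size `eta`, and the Laplace `noise_scale` are all assumptions chosen for readability, and no privacy or approximation guarantee is claimed for these particular settings.

```python
import numpy as np

def online_dual_mwu(values, bundles, supply, p_max, eta, noise_scale, seed=0):
    """Simplified sketch of one pass of dual online multiplicative weights.

    values[i, k]  : agent i's value for option k (option 0 = "get nothing", value 0)
    bundles[i, k] : length-m demand vector of option k (all zeros for option 0)
    supply        : length-m vector of per-resource supply
    All argument names and the input layout are hypothetical.
    """
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    bundles = np.asarray(bundles, dtype=float)
    supply = np.asarray(supply, dtype=float)
    n, m = values.shape[0], supply.shape[0]

    # Log-weights over m resources plus one slack coordinate (an assumption),
    # so that prices are nonnegative and sum to at most p_max.
    log_w = np.zeros(m + 1)

    def current_prices():
        w = np.exp(log_w - log_w.max())  # subtract the max for numerical stability
        return p_max * w[:m] / w.sum()

    allocation, payments = {}, {}
    for i in rng.permutation(n):  # agents arrive in a uniformly random order
        prices = current_prices()
        utilities = values[i] - bundles[i] @ prices
        k = int(np.argmax(utilities))  # best response to the posted prices
        allocation[int(i)] = k
        payments[int(i)] = float(bundles[i, k] @ prices)

        # Proxy subgradient from this single agent, perturbed with Laplace noise.
        noisy_demand = bundles[i, k] + rng.laplace(scale=noise_scale, size=m)
        # Raise the price of resources demanded above their per-agent share.
        log_w[:m] += eta * (noisy_demand - supply / n)

    return allocation, payments, current_prices()
```

Each agent is handled exactly once, on arrival, and pays the prices posted in his own round. Since those prices depend only on earlier agents' noisy demands, the chosen bundle maximizes his utility in this sketch.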
5.1 Proof of Theorem 1.3 (privacy)
By standard privacy properties of the Laplace mechanism and the composition theorem, the noisy demand vectors ’s are -differentially private if the noise scale is , and are -differentially private if the noise scale is . Then, the joint differential privacy of Algorithm 2 follows by the Billboard Lemma (Lemma 2.4).
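To make the two composition regimes concrete, the following sketch computes Laplace noise scales for a sequence of releases. It is an illustration under stated assumptions, not the paper's parameter choice: `sensitivity` (the assumed L1 change in one released vector when a single agent's data changes) and `num_releases` are hypothetical names, and the advanced-composition constant is one standard sufficient choice for a total budget below 1, not an optimized one.

```python
import math
import numpy as np

def laplace_scale_basic(sensitivity, epsilon, num_releases):
    """Scale for which num_releases Laplace releases are epsilon-DP in total
    under basic composition (each release spends epsilon / num_releases)."""
    return sensitivity * num_releases / epsilon

def laplace_scale_advanced(sensitivity, epsilon, delta, num_releases):
    """A sufficient scale for (epsilon, delta)-DP in total when epsilon < 1,
    via advanced composition; the constant is not optimized."""
    per_release_eps = epsilon / (
        2.0 * math.sqrt(2.0 * num_releases * math.log(1.0 / delta))
    )
    return sensitivity / per_release_eps

def noisy_release(vector, scale, rng):
    """Add independent Laplace noise of the given scale to every coordinate."""
    vector = np.asarray(vector, dtype=float)
    return vector + rng.laplace(scale=scale, size=vector.shape)
```

With many releases, the advanced-composition scale grows roughly with the square root of the number of releases instead of linearly. Once the noisy vectors are public, each agent's output is computed from them and his own data only, which is the setting the Billboard Lemma addresses.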
5.2 Proof of Theorem 1.3 (approximation)
5.2.1 Key lemma
We will first establish a key lemma that is an analogue of Lemma 3.7 in the previous section.
Lemma 5.1
For any given such that , with high probability, we have:
[TABLE]
The proof of Lemma 5.1 follows by a sequence of technical lemmas as follows.
Lemma 5.2
For any with , and , we have:
[TABLE]
We will omit the proof of the above lemma because it is essentially the same as that of Lemma 3.4, replacing with .
Next, we decompose the Lagrangian objective into the sum of components ’s, , as follows:
[TABLE]
Then, we have:
[TABLE]
Lemma 5.3
For any with , and , any fixed permutation and over the randomness of the Laplacian noise, we have that with high probability:
[TABLE]
- Proof.
Recall that . By Eqn. (5.3) and Lemma 5.2, we have:
[TABLE]
Summing over , we get that
[TABLE]
Since is a permutation, we have that . Further, recall that our choice of . So the RHS further equals
[TABLE]
Finally, let us consider the randomness of the Laplacian noise. We will use the concentration bound for martingales [2] to bound the last term. By Lemma 3.5, the last term on the RHS is at most with high probability. Rearranging terms proves the lemma.
Lemma 5.4
With high probability, we have:
[TABLE]
- Proof.
We will proceed in two steps. First, we will show that the expectation of the LHS is at least . Then, we will use the standard concentration bound for martingales to bound the deviation of the LHS from its expectation.
For any , let us fix the randomness in the first rounds and, thus, fix . Taking expectation over only the randomness of round , we get that:
[TABLE]
Let be the offline optimal primal solution. Since is the best response to , we have:
[TABLE]
Next, consider the difference between the above quantity and the actual average over agents, i.e., . We have:
[TABLE]
Note that is a random subset of elements in . By the standard concentration bound for sampling without replacement [1], with high probability over the randomness of , we have that
[TABLE]
and for any
[TABLE]
Putting together, we have:
[TABLE]
where the last inequality follows by the optimality of .
Summing over , noting that , we have:
[TABLE]
Then, consider a sequence of random variables ’s as follows:
[TABLE]
Note that , so is a martingale. Hence, with high probability, we have:
[TABLE]
Summing over together with Eqn. (5.4) proves the lemma.
Putting together Lemma 5.3 and Lemma 5.4 proves Lemma 5.1.
5.2.2 Approximate optimality
Let and for all . By Lemma 5.1, the following holds with high probability:
[TABLE]
Plugging in our choice of parameters, the RHS further equals
[TABLE]
So we have the desired approximate optimality guarantee.
5.2.3 Approximate feasibility
We choose to penalize the over-demands and, thus, make as small as possible. We let , where is the most over-demanded constraint and let for any . Let be the over-demand of . By Lemma 5.1 and the choice of and , with high probability we have:
[TABLE]
By the choice of , we further get that:
[TABLE]
Note that , because increasing the supply per resource from to increases the optimal packing objective by at most a factor. So we have:
[TABLE]
where the last inequality is due to , $b \geq \tilde{O}\big(\frac{\sqrt{n}\sigma}{\alpha}\big)$, and . So we have that:
[TABLE]
So we have . Recalling that $b \geq \tilde{O}\big(\frac{\sqrt{n}\sigma}{\alpha}\big)$, we have .
A Missing proofs in Section 3
A.1 Proof of Lemma 3.4
- Proof.
By the definition of KL-divergence, we get that:
[TABLE]
where the last equality is due to . Next, we bound the two terms separately. The first term equals:
[TABLE]
By the definition of , we have that
[TABLE]
and similarly $\tfrac{\eta}{p_{\max}}\big\langle p^{(t)}, \nabla\bar{D}(p^{(t)})\big\rangle \geq -\tfrac{1}{2}$. Note that for any , we have:
[TABLE]
Now we bound the second term using inequalities for any and for any .
[TABLE]
Then, we further upper bound the above using and , and get that:
[TABLE]
Finally, by the fact that , and for any , $\big\langle p, |\nabla\bar{D}(p^{(t)})|\big\rangle$ is upper bounded by . So we have:
[TABLE]
Putting (A.1), (A.2), and (A.3) together proves the lemma.
A.2 Proof of Lemma 3.5
- Proof.
In our case, ’s are unbounded but have exponentially small tail contributions. We follow the standard strategy of proving Azuma-Hoeffding type of concentration bounds. By symmetry of the random variables ’s, it suffices to show the first inequality.
Let $X_k = \sum_{t=0}^{k}\big\langle q^{(t)}, v^{(t)}\big\rangle$ for . For a positive whose value will be determined later, we have:
[TABLE]
Next, we upper bound $\operatorname{E}\big[\exp\big(\lambda\big(X_T - X_0\big)\big)\big]$, which can be rewritten as:
[TABLE]
For any , we have that:
[TABLE]
Further, for any , we have:
[TABLE]
The second-to-last inequality holds for any . Hence, we have:
[TABLE]
Next, note that $\sum_{j=1}^{m}\big(q^{(t)}_j\big)^2 \leq \big(\sum_{j=1}^{m} q^{(t)}_j\big)^2 = p_{\max}^2$. We have:
[TABLE]
and, thus,
[TABLE]
Thus, we get that:
[TABLE]
The lemma then follows by choosing
[TABLE]
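As a hedged illustration of the Chernoff-style argument used in this proof, the following computation makes three assumptions that are not statements from the paper. Each $v^{(t)}_j$ is an independent Laplace variable with scale $s$. The vector $q^{(t)} \geq 0$ satisfies $\sum_j q^{(t)}_j = p_{\max}$ and is determined before the noise of round $t$ is drawn. There are $K$ rounds. The symbols $s$, $K$, $\theta$, and $\Delta$ are introduced here for illustration only, and the constants are not optimized and need not match those of the lemma.

```latex
% Moment generating function of a Laplace variable v with scale s, for |\theta| s < 1:
\operatorname{E}\big[e^{\theta v}\big] = \frac{1}{1 - \theta^2 s^2}
  \le e^{2\theta^2 s^2} \quad \text{whenever } \theta^2 s^2 \le \tfrac{1}{2}.

% One round, with \theta = \lambda q^{(t)}_j and \lambda p_{\max} s \le 1/\sqrt{2}:
\operatorname{E}\Big[e^{\lambda \langle q^{(t)}, v^{(t)} \rangle}\Big]
  \le \exp\Big(2\lambda^2 s^2 \sum_{j=1}^{m} \big(q^{(t)}_j\big)^2\Big)
  \le \exp\big(2\lambda^2 s^2 p_{\max}^2\big).

% K rounds (conditioning round by round), then Markov's inequality:
\Pr\Big[\sum_{t} \langle q^{(t)}, v^{(t)} \rangle \ge \Delta\Big]
  \le \exp\big(-\lambda\Delta + 2K\lambda^2 s^2 p_{\max}^2\big)
  = \exp\Big(-\frac{\Delta^2}{8 K s^2 p_{\max}^2}\Big)
  \quad \text{for } \lambda = \frac{\Delta}{4 K s^2 p_{\max}^2}.
```

Under these assumptions, taking $\Delta = s\, p_{\max} \sqrt{8K\ln(1/\beta)}$ gives failure probability at most $\beta$, provided $\ln(1/\beta) \leq K$ so that the constraint $\lambda p_{\max} s \leq 1/\sqrt{2}$ holds.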
A.3 Proof of Lemma 3.6
- Proof.
By symmetry of the random variables ’s, it suffices to show the first inequality. For a positive to be determined later, we have:
[TABLE]
Next, we bound $\operatorname{E}\Big[\exp\Big(\lambda\Big(q^{(t)}_j \cdot \max\big\{0, v^{(t)}_j - \tfrac{\ln(T)}{\epsilon'}\big\}\Big)\Big)\Big]$ by:
[TABLE]
where the second-to-last inequality is due to . Thus, we have:
[TABLE]
So we have:
[TABLE]
and, thus,
[TABLE]
which equals due to our choice of .
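The threshold $\ln(T)/\epsilon'$ is a natural truncation point, as a short numerical check suggests. The sketch below assumes that each $v^{(t)}_j$ is Laplace with scale $1/\epsilon'$. The threshold suggests this scale, but it is an assumption here, and the function names are hypothetical. Under that assumption the expected excess above the threshold is $\frac{1}{2\epsilon' T}$ per round, so its sum over $T$ rounds is $\frac{1}{2\epsilon'}$, independent of $T$.

```python
import math
import numpy as np

def expected_excess(eps_prime, T):
    """Closed form of E[max(0, v - ln(T)/eps')] for v ~ Laplace(scale=1/eps')."""
    scale = 1.0 / eps_prime
    threshold = math.log(T) / eps_prime
    # For Laplace noise with scale s and threshold c >= 0 this is (s/2) * exp(-c/s),
    # which simplifies to 1 / (2 * eps' * T) for the threshold above.
    return 0.5 * scale * math.exp(-threshold / scale)

def simulated_excess(eps_prime, T, num_samples=1_000_000, seed=0):
    """Monte Carlo estimate of the same expectation."""
    rng = np.random.default_rng(seed)
    v = rng.laplace(scale=1.0 / eps_prime, size=num_samples)
    return float(np.maximum(0.0, v - math.log(T) / eps_prime).mean())
```

Comparing `expected_excess(eps_prime, T)` with `simulated_excess(eps_prime, T)` for a few values of `T` shows the per-round excess shrinking in proportion to `1 / T`.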
A.4 Proof of Lemma 3.7
- Proof.
By Lemma 3.2, we have:
[TABLE]
Then, applying Lemma 3.4, we further get that:
[TABLE]
Putting together, we have:
[TABLE]
Next, we bound the last two terms separately. By the definition of , the last term can be rewritten as:
[TABLE]
Applying Lemma 3.5 twice, we get that with probability at least ,
[TABLE]
As for the second-to-last term, we can upper bound it as:
[TABLE]
Note that by the definition of , $\big|\nabla_j\hat{D}(p^{(t)}) - \nabla_j\bar{D}(p^{(t)})\big|$ can be rewritten as:
[TABLE]
Further, . So the above is at most (recall that ):
[TABLE]
Applying Lemma 3.6 twice, we get that with probability at least ,
[TABLE]
Putting together (A.4), (A.5), and (A.6) proves the lemma.
References
- [1] Shipra Agrawal and Nikhil R. Devanur. Fast algorithms for online stochastic convex programming. In SODA, pages 1405–1424. SIAM, 2015.
- [2] Kazuoki Azuma. Weighted sums of certain dependent random variables. Tohoku Mathematical Journal, Second Series, 19(3):357–367, 1967.
- [3] Raef Bassily, Adam D. Smith, and Abhradeep Thakurta. Private empirical risk minimization: Efficient algorithms and tight error bounds. In FOCS, pages 464–473. IEEE Computer Society, 2014.
- [4] Avrim Blum, Katrina Ligett, and Aaron Roth. A learning theory approach to noninteractive database privacy. J. ACM, 60(2):12:1–12:25, 2013.
- [5] Niv Buchbinder and Joseph Naor. Online primal-dual algorithms for covering and packing. Mathematics of Operations Research, 34(2):270–286, 2009.
- [6] Kamalika Chaudhuri and Daniel J. Hsu. Sample complexity bounds for differentially private learning. In COLT, volume 19 of JMLR Proceedings, pages 155–186. JMLR.org, 2011.
- [7] Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam D. Smith. Calibrating noise to sensitivity in private data analysis. In TCC, volume 3876 of Lecture Notes in Computer Science, pages 265–284. Springer, 2006.
- [8] Cynthia Dwork, Guy N. Rothblum, and Salil P. Vadhan. Boosting and differential privacy. In FOCS, pages 51–60. IEEE Computer Society, 2010.
