Exploring Differential Obliviousness

Amos Beimel; Kobbi Nissim; Mohammad Zaheri

arXiv:1905.01373·cs.CR·October 4, 2019


TL;DR

This paper investigates the concept of differential obliviousness, a relaxed privacy notion for algorithms, demonstrating its potential benefits in property testing and tasks with input-dependent exploration, beyond traditional oblivious algorithms.

Contribution

It extends the understanding of differential obliviousness by analyzing its advantages in property testing and input-dependent tasks, highlighting scenarios where it outperforms full obliviousness.

Findings

1. Differential obliviousness yields an almost linear improvement in overhead for property testing in the dense graph model.

2. In the bounded degree model, the improvement is at most quadratic.

3. Differential obliviousness can preserve input-dependent exploration behavior, which full obliviousness cannot.

Abstract

In a recent paper Chan et al. [SODA '19] proposed a relaxation of the notion of (full) memory obliviousness, which was introduced by Goldreich and Ostrovsky [J. ACM '96] and extensively researched by cryptographers. The new notion, differential obliviousness, requires that any two neighboring inputs exhibit similar memory access patterns, where the similarity requirement is that of differential privacy. Chan et al. demonstrated that differential obliviousness allows achieving improved efficiency for several algorithmic tasks, including sorting, merging of sorted lists, and range query data structures. In this work, we continue the exploration and mapping of differential obliviousness, focusing on algorithms that do not necessarily examine all their input. This choice is motivated by the fact that the existence of logarithmic overhead ORAM protocols implies that differential…

Equations

$$\Pr[\mathsf{Exp}^{A,M}_{0}(\lambda,n)=1]\leq e^{\varepsilon}\cdot\Pr[\mathsf{Exp}^{A,M}_{1}(\lambda,n)=1]+\delta.$$

$$\operatorname{{\rm dist}}_{d}(G_{1},G_{2})\triangleq\frac{|\{(v,i):v\in V,\,i\in[d],\,f_{G_{1}}(v,i)\neq f_{G_{2}}(v,i)\}|}{dn}.$$

$$\Pr[\textsc{Tester}\text{ probes 2 nodes whose distance is 1 or 2 on }\operatorname{{\rm perm}}(G_{1})]\geq\Pr[\textsc{Tester}\text{ probes 2 nodes whose distance is 1 or 2 on }\operatorname{{\rm perm}}(G_{2})]\geq 1/2.$$

$$\Pr[A(H_{1},H_{2},\mathsf{Access}^{\textsc{Tester}}(H_{1}))=1]\leq e^{4\varepsilon}\cdot\Pr[A(H_{1},H_{2},\mathsf{Access}^{\textsc{Tester}}(H_{2}))=1]+4e^{4\varepsilon}\delta.$$

$$\frac{1}{2n}-O(q^{2}/n)\leq\Pr[A(H_{1})=1]\leq e^{4\varepsilon}\Pr[A(H_{2})=1]+e^{4\varepsilon}\delta\leq e^{4\varepsilon}O(q^{2}/n^{2})+e^{4\varepsilon}\delta.$$

$$\varepsilon^{\prime}=\sum_{t=1}^{n}\frac{\varepsilon}{t\log n}\leq\frac{\varepsilon}{\log n}(\ln n+1)\leq\varepsilon,$$

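$$\begin{aligned}
\Pr[{\cal B}(\mathbf{x})\in S]&\leq e^{\varepsilon}\Pr[{\cal A}(\mathbf{y})\in S]+\gamma\\
&\leq e^{\varepsilon}\left(\Pr[{\cal B}(\mathbf{y})\in S]+\gamma\right)+\gamma\\
&=e^{\varepsilon}\Pr[{\cal B}(\mathbf{y})\in S]+(1+e^{\varepsilon})\gamma.
\end{aligned}$$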

$$\begin{aligned}
\Pr[\mathsf{Tester}^{\prime}_{{\cal T}}(G)=1]&\leq\sum_{a}\Pr[\tilde{T}=a]\Pr[c(G^{\prime})>a-1]\\
&\leq e^{\varepsilon}\sum_{a}\Pr[\tilde{T}=a-1]\Pr[c(G^{\prime})>a-1]\\
&\leq e^{\varepsilon}\Pr[\mathsf{Tester}^{\prime}_{{\cal T}}(G^{\prime})=1].
\end{aligned}$$

$$e^{-\varepsilon}\Pr[\tilde{T}_{\ell_{\tau}}=a]\leq\Pr[\tilde{T}_{\ell_{\tau}}=a-2\log(2/\delta)]\leq e^{\varepsilon}\Pr[\tilde{T}_{\ell_{\tau}}=a]$$

$$\begin{aligned}
\Pr[\mathsf{Access}^{\textsc{Locate}^{\prime}_{\cal P}}(\mathbf{x})\in S]
&=\sum_{\tau=(\tilde{T}_{1},\dots,\tilde{T}_{\log n})}\Pr[\mathsf{Access}^{\textsc{Locate}^{\prime}_{\cal P}}(\mathbf{x})\in S\mid\tilde{T}_{1},\dots,\tilde{T}_{\log n}]\Pr[\tilde{T}_{1},\dots,\tilde{T}_{\log n}]\\
&=\sum_{\tau=(\tilde{T}_{1},\dots,\tilde{T}_{\log n})}\Pr[\mathsf{Access}^{\textsc{Locate}^{\prime}_{\cal P}}(\mathbf{x}^{\prime})\in S\mid\tilde{T}_{1},\dots,\tilde{T}_{\ell_{\tau}-1},\tilde{T}_{\ell_{\tau}}-2\log(2/\delta),\tilde{T}_{\ell_{\tau}+1},\dots,\tilde{T}_{\log n}]\\
&\qquad\cdot\Pr[\tilde{T}_{1},\dots,\tilde{T}_{\log n}]\\
&\leq e^{\varepsilon}\sum_{\tau=(\tilde{T}_{1},\dots,\tilde{T}_{\log n})}\Pr[\mathsf{Access}^{\textsc{Locate}^{\prime}_{\cal P}}(\mathbf{x}^{\prime})\in S\mid\tilde{T}_{1},\dots,\tilde{T}_{\ell_{\tau}-1},\tilde{T}_{\ell_{\tau}}-2\log(2/\delta),\tilde{T}_{\ell_{\tau}+1},\dots,\tilde{T}_{\log n}]\\
&\qquad\cdot\Pr[\tilde{T}_{1},\dots,\tilde{T}_{\ell_{\tau}-1},\tilde{T}_{\ell_{\tau}}-2\log(2/\delta),\tilde{T}_{\ell_{\tau}+1},\dots,\tilde{T}_{\log n}]\\
&=e^{\varepsilon}\Pr[\mathsf{Access}^{\textsc{Locate}^{\prime}_{\cal P}}(\mathbf{x}^{\prime})\in S].
\end{aligned}$$

$$\begin{aligned}
\Pr[\mathsf{Access}^{\textsc{Locate}^{\prime}_{\cal P}}(\mathbf{x})\in S]
&\geq e^{-\varepsilon}\sum_{\tau=(\tilde{T}_{1},\dots,\tilde{T}_{\log n})}\Pr[\mathsf{Access}^{\textsc{Locate}^{\prime}_{\cal P}}(\mathbf{x}^{\prime})\in S\mid\tilde{T}_{1},\dots,\tilde{T}_{\ell_{\tau}-1},\tilde{T}_{\ell_{\tau}}-2\log(2/\delta),\tilde{T}_{\ell_{\tau}+1},\dots,\tilde{T}_{\log n}]\\
&\qquad\cdot\Pr[\tilde{T}_{1},\dots,\tilde{T}_{\ell_{\tau}-1},\tilde{T}_{\ell_{\tau}}-2\log(2/\delta),\tilde{T}_{\ell_{\tau}+1},\dots,\tilde{T}_{\log n}]\\
&=e^{-\varepsilon}\Pr[\mathsf{Access}^{\textsc{Locate}^{\prime}_{\cal P}}(\mathbf{x}^{\prime})\in S].
\end{aligned}$$

$$\begin{aligned}
&\Big|\Pr[\mathsf{Access}^{\textsc{Locate}_{\cal P}}(\mathbf{x})\in S]-\Pr[\mathsf{Access}^{\textsc{Locate}^{\prime}_{\cal P}}(\mathbf{x})\in S]\Big|\\
&\quad=\Big|\Pr[\mathsf{Access}^{\textsc{Locate}_{\cal P}}(\mathbf{x})\in S\mid A]\Pr[A]+\Pr[\mathsf{Access}^{\textsc{Locate}_{\cal P}}(\mathbf{x})\in S\mid\bar{A}]\Pr[\bar{A}]\\
&\quad\qquad-\Pr[\mathsf{Access}^{\textsc{Locate}^{\prime}_{\cal P}}(\mathbf{x})\in S\mid A]\Pr[A]-\Pr[\mathsf{Access}^{\textsc{Locate}^{\prime}_{\cal P}}(\mathbf{x})\in S\mid\bar{A}]\Pr[\bar{A}]\Big|\\
&\quad=\Big|\Pr[\mathsf{Access}^{\textsc{Locate}_{\cal P}}(\mathbf{x})\in S\mid A]-\Pr[\mathsf{Access}^{\textsc{Locate}^{\prime}_{\cal P}}(\mathbf{x})\in S\mid A]\Big|\Pr[A]\\
&\quad\leq\Pr[A]\leq\delta.
\end{aligned}$$

$$\binom{m}{2T}(1-p)^{m-2T}\leq m^{2T}e^{-(m-2T)p}\leq e^{-(m-2T)p+2T\ln m}.$$

$$\max\nolimits_{1}-\min\nolimits_{1}\leq 1+\left(2\cdot\frac{\log 1/\beta^{\prime}}{\varepsilon^{\prime}}+1\right)\cdot\frac{\max_{0}-\min_{0}}{\frac{4\log(1/\beta^{\prime})}{\varepsilon^{\prime}}}\leq 3\cdot\frac{\log 1/\beta^{\prime}}{\varepsilon^{\prime}}\cdot\frac{\max_{0}-\min_{0}}{\frac{4\log(1/\beta^{\prime})}{\varepsilon^{\prime}}}=\frac{3(\max_{0}-\min_{0})}{4}.$$

$$e^{-\varepsilon^{\prime}}\leq e^{-|I(\mathbf{x})-I(\mathbf{x}^{\prime})|\varepsilon^{\prime}}\leq\frac{\Pr[\mathrm{Lap}(1/\varepsilon^{\prime})+I(\mathbf{x})\in A]}{\Pr[\mathrm{Lap}(1/\varepsilon^{\prime})+I(\mathbf{x}^{\prime})\in A]}\leq e^{|I(\mathbf{x})-I(\mathbf{x}^{\prime})|\varepsilon^{\prime}}\leq e^{\varepsilon^{\prime}}.$$


Full text

Exploring Differential Obliviousness

Work supported by NSF grant No. 1565387 TWC: Large: Collaborative: Computing Over Distributed Sensitive Data.

Amos Beimel Dept. of Computer Science, Ben-Gurion University, Israel. [email protected]. Work done while A.B. was visiting Georgetown University.

Kobbi Nissim Dept. of Computer Science, Georgetown University. [email protected].

Mohammad Zaheri Dept. of Computer Science, Georgetown University. [email protected].

In a recent paper, Chan et al. [SODA ’19] proposed a relaxation of the notion of (full) memory obliviousness, which was introduced by Goldreich and Ostrovsky [J. ACM ’96] and extensively researched by cryptographers. The new notion, differential obliviousness, requires that any two neighboring inputs exhibit similar memory access patterns, where the similarity requirement is that of differential privacy. Chan et al. demonstrated that differential obliviousness allows achieving improved efficiency for several algorithmic tasks, including sorting, merging of sorted lists, and range query data structures.

In this work, we continue the exploration of differential obliviousness, focusing on algorithms that do not necessarily examine all their input. This choice is motivated by the fact that the existence of logarithmic overhead ORAM protocols implies that differential obliviousness can yield at most a logarithmic improvement in efficiency for computations that need to examine all their input. In particular, we explore property testing, where we show that differential obliviousness yields an almost linear improvement in overhead in the dense graph model, and at most quadratic improvement in the bounded degree model. We also explore tasks where a non-oblivious algorithm would need to explore different portions of the input, where the latter would depend on the input itself, and where we show that such a behavior can be maintained under differential obliviousness, but not under full obliviousness. Our examples suggest that there would be benefits in further exploring which class of computational tasks are amenable to differential obliviousness.

1 Introduction

A program’s memory access pattern can leak significant information about the private information used by the program even if the memory content is encrypted. Such leakage can turn into a data protection problem in various settings. In particular, where data is outsourced to be stored on an external server, it has been shown that access pattern leakage can be exploited in practical attacks and lead to the compromise of the underlying data [20, 4, 29, 21, 23]. Such leakages can also be exploited when a program is executed in a secure enclave environment but needs to access memory that is external to the enclave.

Memory access pattern leakage can be avoided by employing a strategy that makes the sequence of memory accesses (computationally or statistically) independent of the content being processed. Beginning with the seminal work of Goldreich and Ostrovsky, it is well known how to transform any program running on a random access memory (RAM) machine to one with an oblivious memory access pattern while retaining efficiency by using an Oblivious RAM protocol (ORAM) [10, 30, 13]. Current state-of-the-art ORAM protocols achieve logarithmic overhead [2], matching a recent lower bound by Larsen and Nielsen [24], and protocols with $O(1)$ overhead exist when the server is allowed to perform computation and large blocks are retrieved [6, 28]. To further reduce the overhead, oblivious memory access pattern protocols have been devised for specific tasks, including graph algorithms [3, 17], geometric algorithms [8] and sorting [16, 25]. The latter is motivated by sorting being a fundamental and well-researched computational task as well as its ubiquity in data processing.

1.1 Differential Obliviousness

Full obliviousness is a rather strong requirement: any two possible inputs (of the same size) should exhibit identical or indistinguishable sequences of memory accesses. Achieving full obliviousness via a generic use of ORAM protocols requires a setup phase with running time (at least) linear in the memory size and then a logarithmic overhead per memory access.

A recent work by Chan, Chung, Maggs, and Shi [5] put forward a relaxation of the obliviousness requirement where indistinguishability is replaced with differential privacy. Intuitively, this means that any two possible neighboring inputs should exhibit memory access patterns that are similar enough to satisfy differential privacy, but may still be too dissimilar to be “cryptographically” indistinguishable. It is not a priori clear whether differential obliviousness can be achieved without resorting to full obliviousness. However, the recent work of Chan et al. showed that differential obliviousness does allow achieving improved efficiency for several algorithmic tasks, including sorting (over very small domains), merging of sorted lists, and range query data structures.

Also of relevance are the works by He et al. [19] and Mazloom and Gordon [27], which study protocols for secure multiparty computation in which the parties are allowed to learn information from the computation as long as this information preserves the differential privacy of the input. He et al. and Mazloom and Gordon demonstrate that this leakage is useful: He et al. construct protocols for the private record linkage problem for two databases; Mazloom and Gordon present protocols for histograms, PageRank, and matrix factorization.

Furthermore, even the use of ORAM protocols may be insufficient for preventing leakage in cases where the number of memory probes is input dependent. In fact, Kellaris et al. [21] show that such leakage can result in a complete reconstruction in the case of retrieving elements specified by range queries, as the number of records returned depends on the contents of the data structure. Full obliviousness would require that the sequence of memory accesses would be padded to a maximal one to avoid such leakage, a solution that would have a dire effect on the efficiency of many algorithms. Differential obliviousness may in some cases allow achieving meaningful privacy while maintaining efficiency. Examples of such protocols include the combination of ORAM with differentially private sanitization by Kellaris et al. [22] and the recent work of Chan et al. [5] on range query data structures, which avoids using ORAM.

1.2 This Work: Exploring Differential Obliviousness

Noting that the existence of logarithmic overhead ORAM protocols implies that differential obliviousness can yield at most a logarithmic improvement in efficiency for computations that need to examine all their input, we explore tasks where this is not the case. In particular, we focus on property testing and on tasks where the number of memory accesses can depend on the input.

Property testing.

As evidence that differential obliviousness can provide a significant improvement over full obliviousness, we show in Section 3 that property testers in the dense graph model, where the input is in the adjacency matrix representation [12], can be made differentially oblivious. This result captures a large set of testable graph properties [12, 1] including, e.g., graph bipartiteness and having a large clique. Testers in this class probe a uniformly random subgraph and hence are fully oblivious without any modification, as their access pattern does not depend on the input graph. However, this is not the case if the tester reveals its output to the adversary, as this allows learning information about the specific probed subgraph. A fully oblivious tester would need to access a linear-sized subgraph, whereas we show that a differentially oblivious tester only needs to apply the original tester $O(1)$ times. (We omit dependencies on privacy and accuracy parameters from this introductory description.)

We also consider property testing in the bounded degree model, where the input is in the incidence lists model [14]. In this model we provide negative results, demonstrating that adaptive testers cannot, in general, be made differentially oblivious without a significant loss in efficiency. In particular, in Section 4 we consider differentially oblivious property testers for connectivity in graphs of degree at most two. For non-oblivious testers, it is known that a constant number of probes suffices when the tester is adaptive [14]. (In an adaptive tester, at least one choice of a node to probe depends on information gathered from the incidence lists of previously probed nodes.) It is also known that any non-adaptive tester for this task requires probing $\Omega(\sqrt{n})$ nodes [32]. We show that this lower bound extends to differentially oblivious testers, i.e., any differentially oblivious tester for connectivity in graphs of maximal degree $2$ requires $\Omega(\sqrt{n})$ probes. While this still improves over full obliviousness, the gap between full and differential obliviousness is diminished in this case.

Locating an Object Satisfying a Property.

Here, our goal is to check whether a given data set of objects includes an object that satisfies a specified property. Without obliviousness requirements, a natural approach is to probe elements in a random order until an element satisfying the property is found or all elements have been probed. If a $p$ fraction of the elements satisfy the property, then the expected number of probes is $1/p$. This algorithm is in fact instance optimal when the data set is randomly permuted. (Our treatment of instance optimality is rather informal. The concept was originally presented in [9].)
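The non-oblivious baseline described above can be sketched as follows. This is a minimal illustration, not the paper's differentially oblivious algorithm; the function name `locate` and its return convention are assumptions made for this example.

```python
import random

def locate(data, has_property, rng=random):
    """Probe elements in uniformly random order until one satisfies the property.

    Returns (index, probes); index is None if no element qualifies.
    Hypothetical helper for illustration: the number of probes depends on the
    input, which is exactly what an observer of the access pattern learns.
    """
    order = list(range(len(data)))
    rng.shuffle(order)                      # uniformly random probing order
    for probes, i in enumerate(order, start=1):
        if has_property(data[i]):
            return i, probes
    return None, len(data)
```

With 10 qualifying elements out of 100, the average number of probes over many runs is close to $1/p=10$, while a data set with no qualifying element always costs $n$ probes.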

A fully oblivious algorithm would require $\Omega(n)$ probes on any dataset even when $p=1$ . In contrast, we demonstrate in Section 5 that with differential obliviousness instance optimality can, to a large extent, be preserved. Our differentially oblivious algorithm always returns a correct answer and makes at most $m$ probes with probability at least $1-e^{-O(mp)}$ .

Prefix Sum.

Our last example considers a sorted dataset (possibly, the result of an earlier phase in the computation). Our goal is to compute the sum of all records in the (sorted) dataset that are less than or equal to a given value $a$ (see Section 6 for the definition of privacy).

Without obliviousness requirements, one can find the greatest record less than or equal to value $a$, say, using binary search, and then compute the prefix sum by a quick scan through all records appearing before this record. This algorithm is in fact nearly instance optimal, as it can be shown that any algorithm which returns the correct exact answer with non-negligible probability must probe all entries less than or equal to $a$. However, fully oblivious algorithms would have to probe the entire dataset.
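A minimal sketch of this non-oblivious prefix sum, assuming the dataset is an in-memory sorted Python list; the name `prefix_sum` and the returned scan count are illustrative choices, not part of the paper.

```python
from bisect import bisect_right

def prefix_sum(sorted_records, a):
    """Sum of all records <= a in a sorted list.

    Also returns how many entries the final scan touches, since that count
    (and the binary-search probes) is what the access pattern would reveal.
    """
    end = bisect_right(sorted_records, a)   # index just past the last record <= a
    return sum(sorted_records[:end]), end
```

For example, `prefix_sum([1, 3, 3, 7, 10], 3)` scans the first three entries and returns `(7, 3)`. The scan length depends on the data, which is why this baseline is not oblivious.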

In Section 6, we give our nearly instance optimal differentially oblivious prefix sum algorithm. As the probes of a binary search would leak information about the memory content, we introduce a differentially oblivious “simulation” of the binary search. Our differentially oblivious binary search runs in time $O(\log^{2}n)$ .

We also address the scenario where there are multiple prefix sum queries to the same database. If the number of queries is bounded by some integer $t$, then each differentially oblivious binary search will run in time $O(t\log^{2}n)$ (as we need to run the search algorithm with a smaller privacy parameter $\varepsilon$). Using ORAM, one can answer such queries with $O(n\log n)$ preprocessing time and $O(\log^{2}n)$ time per query. Combining our algorithm and ORAM, we can amortize the preprocessing time over $O(\sqrt{n})$ queries; that is, without any preprocessing, the running time of answering the $i$-th query is $O(i\log^{4}n)$ for the first $O(\sqrt{n})$ queries and $O(\log^{2}n)$ for any further query.

1.3 Background Work

The papers by Chan, Chung, Maggs, and Shi [5], He, Machanavajjhala, Flynn, and Srivastava [19], and by Mazloom and Gordon [27] mentioned above are most relevant for this article. As mentioned above, Kellaris et al. [22] examined a similar concept with the goal of preventing reconstruction attacks in secure remote databases. Goldreich, Goldwasser, and Ron [12] initiated the research on graph property testing. Persiano and Yeo [31] showed that the $\Omega(\log n)$ lower bound for ORAM of [24] also holds when the security requirement is relaxed to differential privacy. Goldreich’s book on property testing [11] gives sufficient background for our discussion. Dwork, McSherry, Nissim, and Smith [7] defined differential privacy. For more details on ORAM and a list of relevant papers, the reader can consult [2].

2 Definitions

2.1 Model of Computation

We consider the standard Random Access Memory (RAM) model of computation that consists of a CPU and a memory. The CPU executes a program and is allowed to perform two types of memory operations: read a value from a specified physical address, and write a value to a specified physical address. We assume that the CPU has a private cache where it can store $O(1)$ values (and/or a polylogarithmic number of bits). As an example, in the setting of a client storing its data on the cloud, the client plays the role of the CPU and the cloud server plays the role of the memory.

We assume that a program’s sequence of read and write operations may be visible to an adversary. We will call this sequence the program’s access pattern. We will further assume that the memory content is encrypted so that no other information is leaked about the content read from and stored in memory locations. The program’s access pattern may depend on the program’s input, and may hence leak information about it.
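To make the notion of an access pattern concrete, the following sketch wraps a memory array so that every read and write is logged, and runs an ordinary binary search over it. The class `TracedMemory` and the function `binary_search` are hypothetical names introduced for illustration only; they do not appear in the paper.

```python
class TracedMemory:
    """Memory whose sequence of (operation, address) pairs is visible to an observer.

    The cell contents stay hidden, mirroring the assumption that memory is encrypted.
    """

    def __init__(self, cells):
        self._cells = list(cells)
        self.trace = []                     # the access pattern

    def read(self, addr):
        self.trace.append(("read", addr))
        return self._cells[addr]

    def write(self, addr, value):
        self.trace.append(("write", addr))
        self._cells[addr] = value

    def __len__(self):
        return len(self._cells)


def binary_search(mem, target):
    """Standard lower-bound binary search; every probe goes through mem.read."""
    lo, hi = 0, len(mem)
    while lo < hi:
        mid = (lo + hi) // 2
        if mem.read(mid) < target:
            lo = mid + 1
        else:
            hi = mid
    return lo
```

On the two sorted inputs `[1, 2, 3, 4, 5, 6, 7, 8]` and `[1, 2, 3, 4, 4, 6, 7, 8]`, which differ in a single entry, searching for 5 reads addresses 4, 2, 3 in the first case and 4, 6, 5 in the second. An observer who sees only the addresses can therefore tell the two inputs apart.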

2.2 Oblivious Algorithms

There are various works focused on oblivious algorithms [8, 15, 26] and Oblivious RAM (ORAM) constructions [13]. These works adopt “full obliviousness” as a privacy notion. Suppose that $M(\lambda,\mathbf{x})$ is an algorithm that takes in two inputs, a security parameter $\lambda$ and an input dataset denoted $\mathbf{x}$ . We denote by $\mathsf{Access}^{M}(\lambda,\mathbf{x})$ , the ordered sequence of memory accesses the algorithm $M$ makes on the input $\lambda$ and $\mathbf{x}$ .

Definition 2.1 (Fully Oblivious Algorithms).

Let $\delta$ be a function in a security parameter $\lambda$ . We say that algorithm $M$ is $\delta$ -statistically oblivious, iff for all inputs $\mathbf{x}$ and $\mathbf{y}$ of equal length, and for all $\lambda$ , it holds that $\mathsf{Access}^{M}(\lambda,\mathbf{x})\approx^{\delta(\lambda)}\mathsf{Access}^{M}(\lambda,\mathbf{y})$ where $\approx^{\delta(\lambda)}$ denotes that the two distributions have at most $\delta(\lambda)$ statistical distance. We say that $M$ is perfectly oblivious when $\delta=0$ .

2.3 Differentially Oblivious Algorithms

Suppose that $M(\lambda,\mathbf{x},q)$ is a (stateful) algorithm that takes in three inputs, a security parameter $\lambda>0$, an input dataset denoted by $\mathbf{x}$ and a value $q$. We slightly change the definition of differentially oblivious algorithms given in [5]:

Definition 2.2 (Neighbor-respecting).

We say that two input datasets $\mathbf{x}$ and $\mathbf{y}$ are neighboring iff they are of the same length and differ in exactly one entry. We say that $A=(A_{1},A_{2})$ is a neighbor-respecting adversary iff for every $\lambda$ and every $n$, $A_{1}$ outputs neighboring datasets $\mathbf{x}_{0},\mathbf{x}_{1}$ with probability 1.

Definition 2.3.

Let $\varepsilon,\delta$ be privacy parameters. Let $M$ be a (possibly stateful) algorithm described as above. To an adversary $A$ we associate the experiment in Figure 1, for every $\lambda\in{\mathbb{N}}$. We say that $M$ is $(\varepsilon,\delta)$-adaptively differentially oblivious if for every (computationally unbounded) stateful neighbor-respecting adversary $A$ we have

$$\Pr[\mathsf{Exp}^{A,M}_{0}(\lambda,n)=1]\leq e^{\varepsilon}\cdot\Pr[\mathsf{Exp}^{A,M}_{1}(\lambda,n)=1]+\delta.$$

In Figure 1, $\mathsf{Access}^{M}(\mathbf{x},q,state)$ denotes the ordered sequence of memory accesses the algorithm $M$ makes on the inputs $\mathbf{x},q$ and $state$ .

Remark 2.4.

The notion of adaptivity here is different from the one defined in [5]. We require that the dataset $\mathbf{x}$ remain the same through the experiment whereas in [5] the adaptive adversary can add or remove entries from the dataset.

As with differential privacy, we usually think about $\varepsilon$ as a small constant and require that $\delta=o(1/n)$ where $n=|\mathbf{x}|$ [7]. Observe that if $M$ is $\delta$ -statistically oblivious then it is also $(0,\delta)$ -differentially oblivious.
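As an illustration of how the statistical and differential notions relate, the sketch below compares two finite distributions over access patterns. It assumes a simplified, non-adaptive, single-query setting in which each input induces an explicit distribution given as a dictionary; the function names are hypothetical and the adaptive experiment of Definition 2.3 is not modeled.

```python
import math

def statistical_distance(p, q):
    """Total variation distance between two finite distributions (dicts: outcome -> prob)."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(a, 0.0) - q.get(a, 0.0)) for a in support)

def dp_delta(p, q, eps):
    """Smallest delta such that p(S) <= e^eps * q(S) + delta holds for every event S.

    The worst-case event collects exactly the outcomes where p exceeds e^eps * q.
    """
    support = set(p) | set(q)
    return sum(max(0.0, p.get(a, 0.0) - math.exp(eps) * q.get(a, 0.0)) for a in support)

def is_differentially_close(p, q, eps, delta, tol=1e-12):
    """Check the (eps, delta) inequality in both directions, up to a float tolerance."""
    return dp_delta(p, q, eps) <= delta + tol and dp_delta(q, p, eps) <= delta + tol
```

For $\varepsilon=0$ the quantity computed by `dp_delta` coincides with the statistical distance, which matches the observation that a $\delta$-statistically oblivious algorithm is $(0,\delta)$-differentially oblivious.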

The following simple lemma will be useful to analyze our algorithms. The proof of the lemma appears in Appendix A.

Lemma 2.5.

Let ${\cal A}$ be an $(\varepsilon,0)$ -differentially oblivious algorithm and ${\cal B}$ be an algorithm such that for every dataset $\mathbf{x}$ the statistical distance between ${\cal A}(\mathbf{x})$ and ${\cal B}(\mathbf{x})$ is at most $\gamma$ (that is, $|\Pr[{\cal A}(\mathbf{x})\in S]-\Pr[{\cal B}(\mathbf{x})\in S]|\leq\gamma$ for every $S$ ). Then, ${\cal B}$ is an $(\varepsilon,(1+e^{\varepsilon})\gamma)$ -differentially oblivious algorithm.
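The form of the bound can be seen from a short chain of inequalities. The following is a sketch for the simplified case in which $\mathbf{x}$ and $\mathbf{y}$ are fixed neighboring datasets and $S$ is any set of outputs; it applies the closeness assumption twice and the guarantee of ${\cal A}$ once.

```latex
\begin{align*}
\Pr[{\cal B}(\mathbf{x})\in S]
  &\leq \Pr[{\cal A}(\mathbf{x})\in S]+\gamma
     && \text{(${\cal A}(\mathbf{x})$ and ${\cal B}(\mathbf{x})$ are $\gamma$-close)}\\
  &\leq e^{\varepsilon}\Pr[{\cal A}(\mathbf{y})\in S]+\gamma
     && \text{(${\cal A}$ is $(\varepsilon,0)$-differentially oblivious)}\\
  &\leq e^{\varepsilon}\bigl(\Pr[{\cal B}(\mathbf{y})\in S]+\gamma\bigr)+\gamma
     && \text{(${\cal A}(\mathbf{y})$ and ${\cal B}(\mathbf{y})$ are $\gamma$-close)}\\
  &= e^{\varepsilon}\Pr[{\cal B}(\mathbf{y})\in S]+(1+e^{\varepsilon})\gamma .
\end{align*}
```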

3 Differentially Oblivious Property Testing of Dense Graph Properties

In this section, we present a differentially oblivious property tester for dense graph properties in the adjacency matrix representation model. A property tester is an algorithm that decides whether a given object has a predetermined property or is far from any object having this property by examining a small random sample of its input. The correctness requirement of property testers ignores objects that neither have the property nor are far from having the property. However, the privacy requirement is “worst case” and should hold for any two neighboring graphs. For the definition of privacy we say that two graphs $G,G^{\prime}$ of size $n$ are neighbors if one can get $G^{\prime}$ by changing the neighbors of exactly one node of $G$.

Property testing of graph properties in the adjacency matrix representation was introduced in [12]. A graph $G=(V,E)$ is represented by the predicate $f_{G}:V\times V\rightarrow\{0,1\}$ such that $f_{G}(u,v)=1$ if and only if $u$ and $v$ are adjacent in $G$. The distance between two graphs is defined to be the number of matrix entries on which they differ, divided by $|V|^{2}$. This model is most suitable for dense graphs, where the number of edges is $\Omega(|V|^{2})$. We define a property ${\cal P}$ of graphs to be a subset of the graphs. We write $G\in{\cal P}$ to denote that graph $G$ has the property ${\cal P}$. For example, we can define the bipartiteness property, where ${\cal P}$ is the set of all bipartite graphs. (Recall that an undirected graph is bipartite, or 2-colorable, if its vertices can be partitioned into two parts, $V_{1}$ and $V_{2}$, such that each part is an independent set, i.e., $E\subseteq\{(u,v):(u,v)\in V_{1}\times V_{2}\}$.)

We say that an $n$-vertex graph $G=(V,E)$ is $\gamma$-far from ${\cal P}$ if for every $n$-vertex graph $G^{\prime}=(V^{\prime},E^{\prime})\in{\cal P}$ it holds that the size of the symmetric difference between $E$ and $E^{\prime}$ is greater than $\gamma n^{2}$. We define property testing in this model as follows:

Definition 3.1 ([12]).

A $(\beta,\gamma)$ -tester for a graph property ${\cal P}$ is a probabilistic algorithm that, on inputs $n,\beta,\gamma$ , and an adjacency matrix of an $n$ -vertex graph $G=(V,E)$ :

1. Outputs 1 with probability at least $\beta$, if $G\in{\cal P}$.

2. Outputs 0 with probability at least $\beta$, if $G$ is $\gamma$-far from ${\cal P}$.

We say a tester has one-sided error if it accepts every graph in ${\cal P}$ with probability 1. We say a tester is non-adaptive if it determines all its queries to the adjacency matrix based only on $n,\beta,\gamma$, and its randomness; otherwise, we say it is adaptive.

Example 3.2 ([12]).

Consider the following $(2/3,\gamma)$ -tester for bipartiteness: Choose a random subset $A\subset V$ of size $\tilde{O}(1/\gamma^{2})$ with uniform distribution and output 1 iff the graph induced by $A$ is bipartite. Clearly, if $G$ is bipartite, then the tester will always return 1. Goldreich et al. [12] proved that if $G$ is $\gamma$ -far from a bipartite graph, then the probability that the algorithm returns 1 is at most $1/3$ .
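The tester in this example can be sketched as follows, assuming the graph is given as a symmetric 0/1 adjacency matrix (a list of lists). The sample size is left as a parameter here; the value $\tilde{O}(1/\gamma^{2})$ from the example is not computed, and the function names are illustrative.

```python
import random
from collections import deque

def induced_bipartite(adj, nodes):
    """Try to 2-color the subgraph induced by `nodes` using BFS; adj[u][v] is 0 or 1."""
    color = {}
    for s in nodes:
        if s in color:
            continue
        color[s] = 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in nodes:
                if v != u and adj[u][v]:
                    if v not in color:
                        color[v] = 1 - color[u]
                        queue.append(v)
                    elif color[v] == color[u]:
                        return False        # odd cycle inside the sample
    return True

def bipartite_tester(adj, sample_size, rng=random):
    """Output 1 iff the subgraph induced by a uniformly random vertex sample is bipartite."""
    n = len(adj)
    sample = rng.sample(range(n), min(sample_size, n))
    return 1 if induced_bipartite(adj, sample) else 0
```

The set of probed matrix entries depends only on the random sample, not on the graph, so the access pattern is input independent. The one-sided error is also visible: a bipartite input is never rejected, because every induced subgraph of a bipartite graph is bipartite.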

Recall that in graph property testing, the tester ${\cal T}$ chooses a random subset of the graph with uniform distribution to test the property ${\cal P}$. Given the access pattern of the tester ${\cal T}$, an adversary will learn nothing since it is uniformly random. Thus, the access pattern by itself does not reveal any information about the input graph. However, we assume that the adversary also learns the tester’s output and can hence learn some information about the input graph based on the output of the tester. To protect this information, we run tester ${\cal T}$ a constant number of times and output $1$ iff the number of times ${\cal T}$ outputs $1$ exceeds a (randomly chosen) threshold.

Let ${\cal T}$ be a $(\beta,\gamma)$ -tester for a graph property ${\cal P}$ where $\beta\leq 1/4$ . We write $c_{\beta,\gamma}$ for the number of nodes that ${\cal T}$ samples. Note that $c_{\beta,\gamma}$ is constant in the graph size and a function of $\beta$ and $\gamma$ . For simplicity, we only consider property testers with one-sided error. In Figure 2, we describe a $(\beta,\gamma)$ -tester $\mathsf{Tester}_{{\cal T}}$ that outputs $1$ with probability at least $\beta$ if $G\in{\cal P}$ and outputs 0 with probability at least $\beta$ , if $G$ is $\gamma^{\prime}$ -far from ${\cal P}$ , where $\gamma^{\prime}$ is defined below.

Theorem 3.3.

Let $\varepsilon,\delta>0$ and $\gamma^{\prime}=\gamma-\frac{4\ln(1/2\delta)c_{\beta,\gamma}}{n\varepsilon}$. Algorithm $\mathsf{Tester}_{{\cal T}}$ is an $(\varepsilon,\delta(1+e^{\varepsilon}))$-differentially oblivious algorithm that outputs 1 with probability 1 if $G\in{\cal P}$, and outputs 0 with probability at least $1-\delta-(2\delta)^{\frac{1}{3\varepsilon}}$ if $G$ is $\gamma^{\prime}$-far from ${\cal P}$.

The proof of Theorem 3.3 appears in Section A.2.

4 Lower Bounds on Testing Connectivity in the Incidence Lists Model

We now consider differentially oblivious testing of connectivity in the incidence lists model [14]. In this model a graph has a bounded degree $d$ and is represented as a function $f:V\times[d]\rightarrow V\cup\{0\}$ , where $f(v,i)$ is the $i$ -th neighbor of $v$ (if no such neighbor exists, then $f(v,i)=0$ ). In this model, the relative distance between graphs is normalized by $dn$ – the maximal number of edges in the graph. Formally, for two graphs with $n$ vertices,

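$$\operatorname{{\rm dist}}_{d}(G_{1},G_{2})\triangleq\frac{|\{(v,i):v\in V,\,i\in[d],\,f_{G_{1}}(v,i)\neq f_{G_{2}}(v,i)\}|}{dn}.$$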

A $(\beta,\gamma)$ -tester in the incidence lists model is defined as in Definition 3.1, where a property ${\cal P}$ is a set of graphs whose maximal degree is $d$ and the distance to a property is defined with respect to $\operatorname{{\rm dist}}_{d}$ .
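The relative distance in the incidence lists model counts the positions $(v,i)$ at which the two incidence functions disagree and normalizes by $dn$. A minimal sketch, assuming each graph is stored as a dictionary from vertex to a list of exactly $d$ entries, with `0` meaning that no such neighbor exists (vertices are therefore labeled from 1); the storage format is an assumption made for this example.

```python
def dist_d(f1, f2, d):
    """Relative distance between two graphs of maximal degree d on the same vertex set.

    f1[v][i] and f2[v][i] hold the (i+1)-th neighbor of v, or 0 if there is none.
    """
    assert f1.keys() == f2.keys()
    n = len(f1)
    differing = sum(
        1 for v in f1 for i in range(d) if f1[v][i] != f2[v][i]
    )
    return differing / (d * n)
```

For instance, the path 1-2-3 stored as `{1: [2, 0], 2: [1, 3], 3: [2, 0]}` and the triangle `{1: [2, 3], 2: [1, 3], 3: [2, 1]}` differ in 2 of the $dn=6$ positions, giving distance $1/3$. The value depends on the order in which neighbors are listed, not only on the edge sets.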

Goldreich and Ron [14] showed how to test if a graph is connected in the incidence lists model in time $\tilde{O}(1/\gamma)$. Raskhodnikova and Smith [32] showed that a tester for connectivity (or any non-trivial property) with run-time $o(\sqrt{n})$ has to be adaptive, that is, the nodes that the algorithm probes should depend on the neighbors of nodes the algorithm has already probed (e.g., the algorithm probes some node $u$, discovers that $v$ is a neighbor of $u$, and probes $v$). We strengthen their results by showing that any tester for connectivity in graphs of maximal degree $2$ and run-time $o(\sqrt{n})$ cannot be a differentially oblivious algorithm. We stress that adaptivity alone is not a reason for inefficiency with differential obliviousness. In fact, there exist differentially oblivious algorithms that are adaptive (e.g., our algorithm in Section 6).

Theorem 4.1.

Let $\varepsilon,\delta>0$ such that $e^{4\varepsilon}\delta<1/(16n)$. Every $(\varepsilon,\delta)$-differentially oblivious $(3/4,1/3)$-tester for connectivity in graphs with maximal degree 2 runs in time $\Omega(\sqrt{n}/e^{2\varepsilon})$.

Proof.

Let Tester be a $(3/4,1/3)$ -tester for connectivity in graphs of degree at most $2$ . We somewhat relax the definition of probes and assume that once the tester probes a node, it sees all edges incident to this node. We prove that if Tester probes fewer than $c\sqrt{n}/e^{2\varepsilon}$ nodes (for some constant $c$ ), then it is not $(\varepsilon,\delta)$ -differentially oblivious.

Assume that $n\equiv 0\pmod{3}$ . Let $G_{1}=(V,E_{1})$ be a cycle of length $n$ and $G_{2}=(V,E_{2})$ consist of $n/3$ disjoint triangles. Clearly, $G_{1}$ is connected and $G_{2}$ is $1/3$ -far from a connected graph. For a permutation $\pi:V\rightarrow V$ , define $\pi(G_{i})=(V,\pi(E_{i}))$ , where $\pi(E_{i})=\{(\pi(u),\pi(v)):(u,v)\in E_{i}\}$ , and let $\operatorname{{\rm perm}}(G_{i})$ be a random graph isomorphic to $G_{i}$ , that is, $\operatorname{{\rm perm}}(G_{i})=\pi(G_{i})$ for a permutation $\pi$ chosen with uniform distribution. (Footnote 5: When we permute a graph, we also permute its incidence list representation, i.e., if $(u,v)\in\pi(E)$ , then with probability half $v$ will be the first neighbor of $u$ and with probability half it will be the second.)

On the random graph $\operatorname{{\rm perm}}(G_{1})$ Tester has to say “yes” with probability at least 3/4 and on the random graph $\operatorname{{\rm perm}}(G_{2})$ Tester has to say “no” with probability at least 3/4.
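The two hard instances can be generated with a few lines of Python. This is only an illustrative sketch: vertices are numbered $0,\dots,n-1$, graphs are kept as plain edge lists (so the random ordering of each incidence list mentioned in the footnote is not modeled), and the helper names `cycle_edges`, `triangle_edges`, `perm`, and `num_components` are assumptions of this example.

```python
import random


def cycle_edges(n):
    """G_1: a single cycle on vertices 0..n-1."""
    return [(v, (v + 1) % n) for v in range(n)]


def triangle_edges(n):
    """G_2: n/3 disjoint triangles on vertices 0..n-1."""
    assert n % 3 == 0
    edges = []
    for base in range(0, n, 3):
        edges += [(base, base + 1), (base + 1, base + 2), (base + 2, base)]
    return edges


def perm(edges, n, rng):
    """Relabel the vertices with a uniformly random permutation pi."""
    pi = list(range(n))
    rng.shuffle(pi)
    return [(pi[u], pi[v]) for u, v in edges]


def num_components(edges, n):
    """Count connected components with a small union-find."""
    parent = list(range(n))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u, v in edges:
        parent[find(u)] = find(v)
    return len({find(v) for v in range(n)})


rng = random.Random(0)
n = 12
H_connected = perm(cycle_edges(n), n, rng)
H_far = perm(triangle_edges(n), n, rng)
```

Both graphs are 2-regular, so a tester that only ever sees isolated paths of length two has no local evidence that separates them.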

Observation 4.2.

If Tester does not probe two distinct nodes whose distance is at most two, then Tester sees a collection of paths of length two and cannot know if the graph is $\operatorname{{\rm perm}}(G_{1})$ or $\operatorname{{\rm perm}}(G_{2})$ .

Claim 4.3.

Given the random graph $\operatorname{{\rm perm}}(G_{1})$ , the tester has to probe two distinct nodes whose distance is at most 2 with probability at least $1/2$ .

Proof.

Consider Tester’s answer when it sees a collection of paths of length $2$ . Assume first that the tester returns “No” with probability at least half in this case and let $p$ be the probability that Tester probes two distinct nodes whose distance is at most two on the random graph $\operatorname{{\rm perm}}(G_{1})$ . The probability that Tester returns “Yes” on $\operatorname{{\rm perm}}(G_{1})$ is at most $p+0.5(1-p)=0.5+0.5p$ . Thus, $0.5+0.5p\geq 3/4$ , i.e., $p\geq 0.5$ .

If the tester returns “Yes” with probability at least half, then, by symmetric arguments, with probability at least $1/2$ Tester has to probe two nodes whose distance is at most two on the random graph $\operatorname{{\rm perm}}(G_{2})$ . For a permutation $\pi$ , if the distance between two nodes in $\pi(G_{2})$ is at most 2, then the distance between these two nodes in $\pi(G_{1})$ is at most 2. Thus, by Observation 4.2,

[TABLE]

∎

Denote the nodes of $G_{1}$ by $V=\{v_{0},\dots,v_{n-1}\}$ and define a distribution on pairs of graphs $H_{1},H_{2}$ , obtained by the following process:

• Choose a permutation $\pi:V\rightarrow V$ with uniform distribution and let $H_{1}=\pi(G_{1})$ .

• Denote $H_{1}=(V,E_{1})$ and $u_{j}=\pi(v_{j})$ for $j\in[n]$ .

• Choose with uniform distribution two indices $i,j$ such that $j\in\{i+3,i+4,\dots,i-3\}$ (where the addition is done modulo $n$ ).

• Let $H_{2}=(V,E_{2})$ , where $E_{2}=\big(E_{1}\setminus\{(u_{i},u_{i+1}),(u_{j},u_{j+1})\}\big)\cup\{(u_{i},u_{j}),(u_{i+1},u_{j+1})\}.$

The graphs are described in Figure 3. Note that $H_{2}$ is also a random graph isomorphic to $G_{1}$ , thus, given $H_{2}$ one cannot know which pair of non-adjacent nodes $u_{i},u_{j}$ was used to create $H_{2}$ .
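The sampling process for the pair $H_{1},H_{2}$ can be sketched in Python as follows. The sketch assumes $n\geq 6$ (so that a valid index $j$ exists), numbers the vertices $0,\dots,n-1$, and represents edges as unordered pairs; the names `sample_pair` and `is_single_cycle` are illustrative only. The check at the end confirms, on random samples, that the edge swap again yields a single cycle on all $n$ nodes and that the two graphs differ on exactly four nodes.

```python
import random


def edge(a, b):
    return frozenset((a, b))


def sample_pair(n, rng):
    """Sample (E1, E2), the edge sets of H_1 and H_2; requires n >= 6."""
    u = list(range(n))
    rng.shuffle(u)                           # u[t] plays the role of pi(v_t)
    E1 = {edge(u[t], u[(t + 1) % n]) for t in range(n)}
    i = rng.randrange(n)
    j = (i + rng.randrange(3, n - 2)) % n    # offset from i lies in {3, ..., n-3}
    removed = {edge(u[i], u[(i + 1) % n]), edge(u[j], u[(j + 1) % n])}
    added = {edge(u[i], u[j]), edge(u[(i + 1) % n], u[(j + 1) % n])}
    return E1, (E1 - removed) | added


def is_single_cycle(edges, n):
    """True iff the edges form one cycle through all n vertices."""
    adj = {v: [] for v in range(n)}
    for e in edges:
        a, b = tuple(e)
        adj[a].append(b)
        adj[b].append(a)
    if any(len(nb) != 2 for nb in adj.values()):
        return False
    prev, cur, steps = None, 0, 0
    while True:                              # walk the cycle that contains vertex 0
        nxt = adj[cur][0] if adj[cur][0] != prev else adj[cur][1]
        prev, cur, steps = cur, nxt, steps + 1
        if cur == 0:
            return steps == n
```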

Observe that $H_{1}$ and $H_{2}$ differ on $4$ nodes. Since Tester is $(\varepsilon,\delta)$ -differentially oblivious, for every algorithm ${\cal A}$ ,

[TABLE]

Consider the following algorithm ${\cal A}$ :

If $u_{i}$ and at least one of $u_{i+1},u_{i+2}$ are probed by $\mbox{\sc Tester}(H)$ prior to seeing any other pair of nodes of distance at most $2$ in $H_{1}$ or $H_{2}$ , then return $1$ ; otherwise return $0$ .

Claim 4.4.

Let $i\in\{1,2\}$ . Suppose that Tester probes at most $q$ nodes. Pick at random with uniform distribution two nodes $u,v$ in $V$ with distance at least $3$ in $H_{i}$ . The probability that $\mbox{\sc Tester}(H_{i})$ probes both $u$ and $v$ prior to seeing any two nodes of distance at most $2$ in $H_{i}$ is $O(q^{2}/n^{2})$ (where the probability is over the random choice of $u,v$ and the randomness of Tester).

Proof.

The node $u$ is a uniformly distributed node in $H_{i}$ and $v$ is any node of distance at least $3$ from $u$ , thus there are $n(n-5)/2$ options for $\{u,v\}$ . Given a collection of paths of length at most $2$ in $H_{i}$ all options are equally likely.

Let $w_{1},\dots,w_{k}$ be the nodes probed in some execution of Tester, where $k\leq q$ . Fix some pair of indices $k_{1}<k_{2}$ . The probability that $\{u,v\}=\{w_{k_{1}},w_{k_{2}}\}$ is at most $\frac{1}{n(n-5)/2}$ . Thus, by the union bound over the at most $\binom{q}{2}$ pairs of indices, the probability that $u$ and $v$ are probed is at most $\frac{\binom{q}{2}}{n(n-5)/2}=O(q^{2}/n^{2}).$ ∎
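A quick numeric check shows why $\sqrt{n}$ is the natural threshold in this bound. The helper below simply evaluates $\binom{q}{2}/(n(n-5)/2)$; comparing it with $1/(2n)$ (the leading term in Claim 4.5) is our own illustration, and the constants hidden in the $O(\cdot)$ notation of the surrounding claims are ignored.

```python
from math import comb


def claim_4_4_bound(q, n):
    """Evaluate C(q, 2) / (n (n - 5) / 2), the union bound over pairs of probes."""
    return comb(q, 2) / (n * (n - 5) / 2)


n = 10**6
small_q = claim_4_4_bound(500, n)     # q = sqrt(n) / 2
large_q = claim_4_4_bound(10**4, n)   # q = 10 * sqrt(n)
```

For $q$ a small constant times $\sqrt{n}$ the bound stays below $1/(2n)$, while for $q$ well above $\sqrt{n}$ it exceeds $1/(2n)$.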

Claim 4.5.

Assume that Tester probes at most $q$ nodes. The probability that ${\cal A}(H_{1})=1$ is at least $1/2n-O(q^{2}/n^{2})$ .

Proof.

By Claim 4.3, the probability that Tester probes at least one pair of nodes with distance at most $2$ is at least $1/2$ . Given that this event occurs, the probability that the random $u_{i}$ (chosen with uniform distribution) has the smallest index in the first such pair in $H_{1}$ (i.e., the first pair is either $(u_{i},u_{i+1})$ or $(u_{i},u_{i+2})$ ) is at least $1/n$ .

Clearly, given these events no two nodes with distance at most $2$ in $H_{1}$ were probed prior to probing the pair containing $u_{i}$ . Furthermore, there are $O(1)$ pairs of nodes that are of distance at most $2$ in $H_{2}$ and are of distance greater than $2$ in $H_{1}$ . By Claim 4.4, the probability that such a pair is probed prior to Tester probing a pair of distance at most $2$ in $H_{1}$ is $O(q^{2}/n^{2})$ . ∎

Claim 4.6.

Suppose that Tester probes at most $q$ nodes. The probability that ${\cal A}(H_{2})=1$ is $O(q^{2}/n^{2})$ .

Proof.

The node $u_{i}$ is a uniformly distributed node in $H_{2}$ . Furthermore, the node $u_{i+1}$ is a uniformly distributed node of distance at least $3$ from $u_{i}$ in $H_{2}$ , thus by Claim 4.4, the probability that Tester probes both $u_{i}$ and $u_{i+1}$ prior to seeing any pair of distance at most $2$ in $H_{2}$ is $O(q^{2}/n^{2})$ . This probability can only decrease if we require that Tester probes both $u_{i}$ and $u_{i+1}$ prior to seeing any pair of distance at most $2$ in $H_{1}$ and in $H_{2}$ .

By the same arguments, the probability that Tester probes both $u_{i}$ and $u_{i+2}$ prior to seeing any pair of distance at most $2$ in $H_{1}$ and in $H_{2}$ is $O(q^{2}/n^{2})$ . ∎

To conclude the proof of Theorem 4.1, we note that by (1) and Claims 4.5 and 4.6

[TABLE]

Since $e^{4\varepsilon}\delta\leq 1/4n$ , it follows that $q=\Omega(\sqrt{n}/e^{2\varepsilon})$ . ∎

5 Differentially Oblivious Algorithm for Locating an Object

Given a dataset of objects $\mathbf{x}$ our goal is to locate an object that satisfies a property ${\cal P}$ , if one exists. E.g., given a dataset consisting of employee records, find an employee with income in the range \$35,000–\$70,000 if such an employee exists in the dataset.

Absent privacy requirements, a simple approach is to probe elements of the dataset in a random order until an element satisfying the property is found or all elements were probed. If a $p$ fraction of the dataset entries satisfy ${\cal P}$ then the expected number of elements sampled by the non-private algorithm is $1/p$ . However, a perfectly oblivious algorithm would require $\Omega(n)$ probes on any dataset, in particular on a dataset where all elements satisfy ${\cal P}$ , where non-privately one probe would suffice. To see why, let ${\cal P}(x)=1$ if $x=1$ and ${\cal P}(x)=0$ otherwise and let $\mathbf{x}$ include exactly one 1-entry in a uniformly random location. Observe that in expectation it requires $\Omega(n)$ memory probes to locate the 1-entry in $\mathbf{x}$ . Perfect obliviousness would hence imply $\Omega(n)$ probes on any input.
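The non-private baseline described above is short enough to write out. The following Python sketch probes positions in a uniformly random order and stops at the first hit; the function name `probe_until_found` and its return convention (index, number of probes) are choices made for this illustration.

```python
import random


def probe_until_found(x, pred, rng):
    """Probe x in random order; return (index, probes), with index None if nothing satisfies pred."""
    order = list(range(len(x)))
    rng.shuffle(order)
    for probes, idx in enumerate(order, start=1):
        if pred(x[idx]):
            return idx, probes
    return None, len(x)


rng = random.Random(0)
is_one = lambda v: v == 1
all_ones = probe_until_found([1] * 100, is_one, rng)
no_ones = probe_until_found([0] * 100, is_one, rng)
single = probe_until_found([0] * 99 + [1], is_one, rng)
```

The stopping time is exactly what leaks: the last probed position is known to satisfy the property, which is the leak that the differentially oblivious algorithm is designed to hide.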

We give a nearly instance optimal differentially oblivious algorithm that always returns a correct answer. Except for probability $e^{-\Omega(mp)}$ the algorithm halts after $m$ steps.

Our Algorithm.

Given the access pattern of the non-private algorithm, an adversary can learn that the last probed element satisfies ${\cal P}$ . To hide this information, we change the stopping condition to having probed at least a (randomly chosen) threshold of elements satisfying ${\cal P}$ . If after $n/2$ probes the number of elements satisfying ${\cal P}$ is below the threshold the entire dataset is scanned. Our algorithm $\mathsf{Locate}_{{\cal P}}$ is described in Figure 4. On a given array $\mathbf{x}$ , algorithm $\mathsf{Locate}_{{\cal P}}$ outputs 1 iff there exists an element in $\mathbf{x}$ satisfying the property ${\cal P}$ .

We remark that Algorithm $\mathsf{Locate}_{{\cal P}}$ uses a mechanism similar to the sparse vector mechanism of [18]. However, in our case instead of using a single noisy threshold across all steps, Algorithm $\mathsf{Locate}_{{\cal P}}$ generates in each step a noisy threshold $\hat{T}=T+\mathrm{Lap}(\frac{1}{\varepsilon^{\prime}})$ . The value of $T$ ensures that with high probability $\hat{T}>0$ . The proof of Theorem 5.1 is given in Section A.3.

Theorem 5.1.

Algorithm $\mbox{Locate}_{\cal P}$ is an $(\varepsilon,\delta(1+e^{\varepsilon}))$ -differentially oblivious algorithm that outputs 1 iff there exists an element in the array that satisfies property ${\cal P}$ . For $m=\Omega(T/p\log(T/p))$ , with probability $1-e^{-\Omega(mp)}$ it halts in time at most $m$ , where $T=\frac{2\log(2/\delta)}{\varepsilon}\ln\frac{\log n}{\delta}$ .
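The following Python sketch illustrates the noisy-threshold idea behind $\mathsf{Locate}_{{\cal P}}$. Figure 4 is not reproduced in this section, so the step structure used here is inferred from the description and the proofs and should be read as an assumption: elements are sampled with replacement, the count is compared with a fresh noisy threshold whenever the number of samples is a power of two, and after $n/2$ samples the whole array is scanned. Logarithms written as $\log$ are taken base 2, the input is assumed to have $n\geq 2$ entries, and the name `locate` and its return value (answer, number of probes) are illustrative. The sketch is not a privacy proof.

```python
import math
import random


def laplace(scale, rng):
    # The difference of two i.i.d. Exp(1) variables, times scale, is Lap(scale).
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def locate(x, pred, eps, delta, rng):
    """Return (answer, probes); answer is 1 iff some element of x satisfies pred."""
    n = len(x)
    eps_prime = eps / (2 * math.log2(2 / delta))     # assumed per-threshold parameter
    T = math.log(math.log2(n) / delta) / eps_prime   # equals T of Theorem 5.1
    count, probes, checkpoint = 0, 0, 1
    while probes < n // 2:
        if pred(x[rng.randrange(n)]):                # sample with replacement
            count += 1
        probes += 1
        if probes == checkpoint:                     # probes is a power of two
            noisy_T = T + laplace(1 / eps_prime, rng)  # fresh noise at every check
            if noisy_T > 0 and count > noisy_T:
                return 1, probes                     # count >= 1, so a witness exists
            checkpoint *= 2
    hits = sum(1 for v in x if pred(v))              # fallback: full scan, no early exit
    return (1 if hits > 0 else 0), probes + n
```

Because an early "1" is returned only when the count exceeds a positive threshold, and every other run ends with a full scan, the answer in this sketch is always correct; only the running time is random.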

6 Differentially Oblivious Prefix Sum

Suppose that there is a dataset consisting of sorted sensitive user records, and one would like to compute the sum of all records in the (sorted) dataset that are less than or equal to a value $a$ in a way that respects individual user’s privacy. We call this task differentially oblivious prefix sum. For the definition of privacy we say that two datasets of size $n$ are neighbors if they agree on $n-1$ elements (although, as sorted arrays they can disagree on many indices). For example, $(1,2,3,4)$ and $(1,3,4,5)$ are neighbors and should have similar access patterns.

Without privacy one can find the greatest record less than or equal to value $a$ , and then compute the prefix sum by a quick scan through all records appearing before such record. Any perfectly oblivious algorithm must read the entire dataset (since it is possible that all elements are smaller than $a$ ). Here, we give a differentially oblivious prefix sum algorithm that for many instances is much faster than any perfectly oblivious algorithm.

Intuition.

Absent privacy requirements, using binary search, one can find the greatest element less than or equal to $a$ , and then compute the prefix sum by a quick scan through all records that appear before such record. However, the binary search access pattern allows the adversary to gain sensitive information about the input. Our main idea is to approximately simulate the binary search and obfuscate the memory accesses to obtain differential obliviousness. In order to do that, we first divide the input array into $k$ chunks (where $k$ is polynomial in $1/\varepsilon,\log 1/\delta$ , and $\log n$ ). Then, we find the chunk that contains the greatest element less than or equal to $a$ by comparing the first element (hence, the smallest element) of each chunk to $a$ . Let $I$ be the index of such chunk. Next, we compute a noisy interval that contains $I$ using the Laplacian distribution. We iteratively repeat this process on the noisy interval, where in each step we eliminate more than a quarter of the elements of the interval. We continue until the size of the array is less than or equal to $k$ . Next, we scan all elements in the remaining array and find the index of the greatest element smaller than or equal to $a$ . Let $i$ be the index of such element; we compute the prefix sum by scanning the array $\mathbf{x}$ until index $i$ .
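The intuition above can be sketched in Python as follows. This is a loose illustration rather than the paper's algorithm: the number of chunks, the way the noisy interval is formed, and the truncation of the Laplace noise to an integer `bound` (in units of chunks) are all assumptions made so that the sketch always returns the correct index and always terminates. It shrinks the interval by at least half per round, which is stronger than the quarter mentioned in the text and is a consequence of the constants chosen here. The function returns the number of entries that are at most $a$, which equals the largest 1-based index $I$ with $x_{I}\leq a$ (or 0 if there is none). No privacy guarantee is claimed for the sketch.

```python
import math
import random


def laplace(scale, rng):
    # The difference of two i.i.d. Exp(1) variables, times scale, is Lap(scale).
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def noisy_search(x, a, eps_step, bound, rng):
    """x is sorted; bound is a positive integer cap on the noise magnitude."""
    k = 8 * (bound + 1)                      # number of chunks (assumed choice)
    lo, hi = 0, len(x)                       # invariant: lo <= answer <= hi
    while hi - lo > k:
        c = -(-(hi - lo) // k)               # chunk length, rounded up
        heads = range(lo, hi, c)             # first position of every chunk
        # Read every chunk head, so the probes depend only on (lo, hi).
        below = sum(1 for s in heads if x[s] <= a)
        I = max(0, below - 1)                # chunk containing the answer
        noise = max(-bound, min(bound, laplace(1 / eps_step, rng)))
        center = I + noise
        new_lo = max(lo, lo + math.floor(center - bound) * c)
        new_hi = min(hi, lo + math.ceil(center + bound + 1) * c)
        lo, hi = new_lo, new_hi
    # Final scan of the remaining (at most k) positions.
    return lo + sum(1 for v in x[lo:hi] if v <= a)
```

Since the truncated noise satisfies $|\text{noise}|\leq\text{bound}$, the new interval always contains chunk $I$ entirely, which is the same invariant that the correctness proof in Section A.4 maintains under its assumption on the noise.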

The Search Algorithm.

We present a search algorithm in Figure 5; on input $\mathbf{x}=(x_{1},\dots,x_{n})$ and $a$ this algorithm finds the largest index $I$ such that $x_{I}\leq a$ . To compute the prefix sum, we compute $\hat{I}=I+\operatorname{{\rm Lap}}(1/\varepsilon)+\frac{\log 1/\delta}{\varepsilon}$ and scan the first $\hat{I}$ elements of the dataset, summing only the first $I$ . We show in Theorem 6.2 that our search algorithm is $(\varepsilon,\delta)$ -differentially oblivious.
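The final scan can be illustrated with a short Python sketch. It takes the index $I$ as given and pads the scan length with shifted Laplace noise, reading $\hat{I}$ positions but adding only the first $I$ values. Two details are assumptions of this sketch and not statements from the paper: the noise is truncated from below so that $\hat{I}\geq I$ always holds, and $\hat{I}$ is rounded up and capped at $n$. The natural logarithm is used for $\log 1/\delta$.

```python
import math
import random


def laplace(scale, rng):
    # The difference of two i.i.d. Exp(1) variables, times scale, is Lap(scale).
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def noisy_prefix_sum(x, I, eps, delta, rng):
    """Return (sum of x[0:I], number of positions read)."""
    n = len(x)
    shift = math.log(1 / delta) / eps
    noise = max(-shift, laplace(1 / eps, rng))   # truncation: assumption of this sketch
    I_hat = min(n, I + math.ceil(noise + shift))
    total = 0
    for pos in range(I_hat):                     # read I_hat positions ...
        value = x[pos]
        total += value if pos < I else 0         # ... but sum only the first I
    return total, I_hat


rng = random.Random(3)
total, scanned = noisy_prefix_sum(list(range(1, 101)), 10, 1.0, 0.01, rng)
```

The number of positions read is the only part of the access pattern that depends on $I$, and it is what the added noise obscures.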

Remark 6.1.

We prove that algorithm Search is an $(\varepsilon,0)$ -differentially private algorithm that returns a correct index with probability at least $1-\beta$ . We could change it to an $(\varepsilon,\delta)$ -differentially private algorithm that never errs. This is done by truncating the noise to $\frac{\log 1/\delta^{\prime}}{\varepsilon^{\prime}}$ .

Theorem 6.2.

Let $\beta<1/n$ and $\varepsilon<\log^{2}n$ . Algorithm Search is an $(\varepsilon,0)$ -differentially oblivious algorithm that, for any input array with size $n$ and $a\in{{\mathbb{R}}}$ , returns a correct index with probability at least $1-\beta$ . The running time of Algorithm Search is $O(\frac{1}{\varepsilon}\log^{2}{n}\log{\frac{1}{\beta}})$ .

Theorem 6.2 is proved in Section A.4.

6.1 Dealing with Multiple Queries

We extend our prefix sum algorithm to answer multiple queries. We can answer a bounded number of queries by running the differentially oblivious prefix sum algorithm multiple times. That is, when we want an $(\varepsilon,0)$ -oblivious algorithm correctly answering $t$ queries with probability at least $1-\beta$ , we execute algorithm Search $t$ times with privacy parameter $\varepsilon/t$ and error probability $\beta/t$ (each time also computing the appropriate prefix sum). Thus, the running time of the algorithm is $O(\frac{t^{2}}{\varepsilon}\log^{2}n\log\frac{t}{\beta})$ (excluding the scan time for computing the sum).

On the other hand, we can use an ORAM to answer an unbounded number of queries. That is, in a pre-processing stage we store the $n$ records and for each record we store the sum of all records up to this record. Thereafter, answering each query will require one binary search. Using the ORAM of [2], the pre-processing will take time $O(n\log n)$ and answering each query will take time $O(\log^{2}n)$ . Thus, the ORAM algorithm is more efficient when $t\geq\sqrt{n}$ .
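The trade-off between the two approaches can be explored numerically. The Python sketch below evaluates the two asymptotic expressions with all hidden constants set to 1 and logarithms taken base 2, which is purely an assumption for illustration; the function names are hypothetical. It then searches for the first $t$ at which the ORAM expression becomes smaller.

```python
import math


def repeated_search_cost(t, n, eps, beta):
    """(t^2 / eps) * log^2 n * log(t / beta), with constants dropped."""
    return (t ** 2 / eps) * math.log2(n) ** 2 * math.log2(t / beta)


def oram_cost(t, n):
    """n log n preprocessing plus log^2 n per query, with constants dropped."""
    return n * math.log2(n) + t * math.log2(n) ** 2


def crossover(n, eps, beta):
    """Smallest t for which the ORAM expression is strictly smaller."""
    t = 1
    while repeated_search_cost(t, n, eps, beta) <= oram_cost(t, n):
        t += 1
    return t


n = 2 ** 20
t_star = crossover(n, 1.0, 2.0 ** -21)
```

With these constant-free expressions the crossover for $n=2^{20}$ occurs well before $t=\sqrt{n}$, which is consistent with the statement that the ORAM solution is preferable once $t\geq\sqrt{n}$.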

We use ORAM along with our differentially oblivious prefix sum algorithm to answer an unbounded number of queries while preserving privacy, combining the advantages of both of the previous algorithms.

Theorem 6.3.

Algorithm MultiSearch, described in Figure 6, is an $(\varepsilon,0)$ -oblivious algorithm, which executes Algorithm Search at most $O(\sqrt{n})$ times, where the run time of the $t$ -th execution is $O(\frac{t}{\varepsilon}\log^{3}n\log\frac{n}{\beta})$ , scans the original database at most once, and in addition each query run time is at most $O(\log^{2}n)$ .

Proof.

First note that we only pay for privacy in the executions of algorithm Search (reading and writing to the ORAM is perfectly private). In the $t$ -th execution of algorithm Search, we insert at least $2t$ elements into the ORAM, thus after $\sqrt{n}$ executions we have inserted at least $\sum_{t=1}^{\sqrt{n}}2t=\sqrt{n}(\sqrt{n}+1)\geq n$ elements into the ORAM.

By simple composition, algorithm MultiSearch is $(\varepsilon^{\prime},0)$ -differentially private, where

[TABLE]

where the last inequality is implied by the sum of the harmonic series. ∎

Appendix A Missing Proofs

A.1 Proof of Lemma 2.5

Proof.

Let $\mathbf{x}$ and $\mathbf{y}$ be two neighboring datasets and $S$ be a set of outputs. Then,

[TABLE]

∎

A.2 Proof of the Correctness and Privacy of Algorithm $\mbox{\sc Tester}_{\cal T}$

Theorem 3.3 is implied by the following lemmas.

Lemma A.1.

Algorithm $\mbox{Tester}_{\cal T}$ is $(\varepsilon,\delta(1+e^{\varepsilon}))$ -differentially oblivious.

Proof.

We first analyze a variant of $\mbox{Tester}_{\cal T}$ , denoted by $\mbox{\sc Tester}^{\prime}_{\cal T}$ , in which Step 4 is replaced by “If $c>\hat{T}$ then output $1$ ” (that is, the algorithm does not check if $c>\min\{4T,\hat{T}\}$ before deciding in the positive).

Let $G=(V,E)$ and $G^{\prime}=(V^{\prime},E^{\prime})$ be two neighboring graphs such that they differ on node $v\in V$ . Fix the random choices of subsets $A$ in Step 2b and observe that after the execution of the for loop, the count $c$ can differ by at most $1$ between the executions on $G$ and $G^{\prime}$ . Let $\tilde{T}$ be the smallest integer greater than $\hat{T}$ . Since algorithm $\mbox{\sc Tester}^{\prime}_{\cal T}$ uses the Laplace mechanism, $e^{-\varepsilon}\Pr[\tilde{T}<a]\leq\Pr[\tilde{T}<a-1]\leq e^{\varepsilon}\Pr[\tilde{T}<a]$ for every $a$ . Thus,

[TABLE]

Similarly, $\Pr[\mbox{\sc Tester}^{\prime}_{\cal T}(G)=1]\geq e^{-\varepsilon}\Pr[\mbox{\sc Tester}^{\prime}_{\cal T}(G^{\prime})=1]$ . Hence, $\mbox{\sc Tester}^{\prime}_{\cal T}$ is $(\varepsilon,0)$ -differentially oblivious.

We next prove that $\mbox{Tester}_{\cal T}$ is $(\varepsilon,\delta(1+e^{\varepsilon}))$ -differentially oblivious using Lemma 2.5, that is, we prove that for every graph $G$ , the statistical distance between $\mbox{Tester}_{\cal T}(G)$ and $\mbox{\sc Tester}^{\prime}_{\cal T}(G)$ is at most $\delta$ . Let $E$ be the event that $\hat{T}>4T$ and observe that the probability $E$ occurs is at most $\delta$ . (Footnote 6: $\Pr[\mathrm{Lap}(\frac{1}{\varepsilon})\geq t/\varepsilon]=\frac{1}{2}e^{-t}$ for every $t>0$ . Thus, $\Pr[E]=\Pr[\mathrm{Lap}(\frac{1}{\varepsilon})\geq\frac{\ln(1/2\delta)}{\varepsilon}]=\delta$ .) We have that $\Big{|}\Pr[\mbox{Tester}_{\cal T}(G)=1]-\Pr[\mbox{\sc Tester}^{\prime}_{\cal T}(G)=1]\Big{|}\leq\Big{|}\Pr[\mbox{Tester}_{\cal T}(G)=1|E]-\Pr[\mbox{\sc Tester}^{\prime}_{\cal T}(G)=1|E]\Big{|}\Pr[E]\leq\Pr[E]\leq\delta$ . Thus, by Lemma 2.5, algorithm $\mbox{Tester}_{\cal T}$ is $(\varepsilon,\delta(1+e^{\varepsilon}))$ -differentially oblivious. ∎
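The Laplace tail identity used in this argument is easy to check numerically. The Python sketch below evaluates the closed-form upper tail $\Pr[\mathrm{Lap}(b)\geq s]=\frac{1}{2}e^{-s/b}$ for $s>0$ and confirms that the cutoff $\ln(1/2\delta)/\varepsilon$ gives tail probability exactly $\delta$; the helper name `laplace_upper_tail` and the sample parameter values are illustrative.

```python
import math


def laplace_upper_tail(scale, s):
    """Pr[Lap(scale) >= s] for s > 0, from the closed-form Laplace CDF."""
    return 0.5 * math.exp(-s / scale)


eps, delta = 0.5, 1e-3
cutoff = math.log(1 / (2 * delta)) / eps      # ln(1/(2 delta)) / eps
tail = laplace_upper_tail(1 / eps, cutoff)    # should equal delta
```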

Observe that Algorithm $\mbox{Tester}_{\cal T}$ never errs when $G\in{\cal P}$ as in that case after the for loop is executed $c=4T$ and hence in Step 4 $\mbox{Tester}_{\cal T}$ outputs $1$ . The next lemma analyses the error probability when $G$ is $\gamma^{\prime}$ -far from ${\cal P}$ .

Lemma A.2.

Algorithm $\mbox{Tester}_{\cal T}$ is a $(1-\delta-(2\delta)^{\frac{1}{3\varepsilon}},\gamma^{\prime})$ -tester for the graph property ${\cal P}$ .

Proof.

Observe that in Step 2c of the algorithm, we are eliminating at most $n\cdot c_{\beta,\gamma}$ edges. Thus, we are eliminating at most $4Tnc_{\beta,\gamma}$ edges in total. Then, when $G$ is $\gamma^{\prime}$ -far from ${\cal P}$ , it is also $\gamma$ -far from ${\cal P}$ after the removal of the observed nodes in each step of the for loop. We next prove that Algorithm $\mbox{Tester}_{\cal T}$ fails with probability at most $\delta+(2\delta)^{\frac{1}{3\varepsilon}}$ . Observe that if Algorithm $\mbox{Tester}_{\cal T}$ fails on $G$ then $c\geq 2T$ or $\mathrm{Lap}(\frac{1}{\varepsilon})\leq-T$ . We define $Z_{i}$ to be the output of ${\cal T}(G)$ in the $i$ -th step of the for loop. Let $Z=\sum_{i}{Z_{i}}$ . Observe that all $Z_{i}$ are independent and $\mathbb{E}[Z]\leq T$ . Using the Chernoff bound (Footnote 7: $\Pr[Z\geq(1+\eta)\mu]\leq e^{-\eta^{2}\mu/(2+\eta)}$ for any $\eta>0$ , where $\mu$ is the expectation of $Z$ ), we obtain that $\Pr[Z\geq 2T]\leq e^{-T/3}=(2\delta)^{\frac{1}{3\varepsilon}}$ . We also know $\Pr[\mathrm{Lap}(\frac{1}{\varepsilon})\leq-\frac{\ln(1/2\delta)}{\varepsilon}]=0.5e^{-\ln(1/2\delta)}=\delta$ . Therefore, Algorithm $\mbox{Tester}_{\cal T}$ fails with probability at most $\delta+(2\delta)^{\frac{1}{3\varepsilon}}$ . ∎

A.3 Proof of the Correctness and Privacy of Algorithm $\mbox{\sc Locate}_{\cal P}$

The proof of Theorem 5.1 follows from the following claim and lemmas.

Claim A.3.

Let $\ell\geq\log n/\log\log n$ . The probability that there exists an element $j\in[n]$ such that algorithm $\mbox{Locate}_{\cal P}$ samples the element $j$ in Step 2a more than $2\ell$ times is less than $2^{-\ell}.$

Proof.

Fix an index $j$ . The probability that the element $j$ is sampled more than $2\ell$ times is less than $\binom{n/2}{2\ell}\frac{1}{n^{2\ell}}\leq\left(\frac{en}{4\ell}\right)^{2\ell}\frac{1}{n^{2\ell}}<\ell^{-2\ell}<2^{-2\log n+2}<2^{2-\ell}/n.$ The claim follows by the union bound. ∎

Lemma A.4.

Let $\delta<1/n$ . Algorithm $\mbox{Locate}_{\cal P}$ is $(\varepsilon,\delta(1+e^{\varepsilon}))$ -differentially oblivious.

Proof.

We first analyze a variant of $\mbox{Locate}_{\cal P}$ , denoted by $\mbox{\sc Locate}^{\prime}_{\cal P}$ , in which Step 2(c)ii is replaced by “If $c>\hat{T}$ then output $1$ ” (that is, the algorithm does not check if $\hat{T}>0$ ) and no element is sampled more than $2\log(2/\delta)$ times. We analyze the privacy of $\mbox{\sc Locate}^{\prime}_{\cal P}(\mathbf{x}^{\prime})$ similarly to the analysis of the sparse vector mechanism in [18].

Let $\mathbf{x}$ and $\mathbf{x}^{\prime}$ be two neighboring datasets such that ${\cal P}(x_{j})=1$ and ${\cal P}(x^{\prime}_{j})=0$ for some $j$ . Denote by $\tau=(\tilde{T}_{1},\dots,\tilde{T}_{\log n})$ the values of the thresholds in an execution of $\mbox{\sc Locate}^{\prime}_{\cal P}$ , where each threshold is rounded up to the smallest integer greater than $\hat{T}$ . Furthermore, let $\ell_{\tau}\in[\log n]$ be the index such that $\mbox{\sc Locate}^{\prime}_{\cal P}$ on input $\mathbf{x}$ outputs 1 when $i=2^{\ell_{\tau}}$ (if no such $i$ exists, then $\ell_{\tau}=\lceil\log n\rceil+1$ ). Observe that in each execution of Step 2(c)ii the count $c$ on input $\mathbf{x}$ is at least the count on input $\mathbf{x}^{\prime}$ and can exceed it by at most $2\log(2/\delta)$ (since $j$ is sampled at most $2\log(2/\delta)$ times). Thus, $\mbox{\sc Locate}^{\prime}_{\cal P}$ on input $\mathbf{x}^{\prime}$ with thresholds $\tau^{\prime}=(\tilde{T}_{1},\dots,\tilde{T}_{\ell_{\tau}-1},\tilde{T}_{\ell_{\tau}}-2\log(2/\delta),\tilde{T}_{\ell_{\tau}+1},\dots,\tilde{T}_{\log n})$ outputs 1 when $i=2^{\ell_{\tau}}$ . Since algorithm $\mbox{\sc Locate}^{\prime}_{\cal P}$ uses the Laplace mechanism with $\varepsilon^{\prime}=\varepsilon/(2\log(2/\delta))$ ,

[TABLE]

for every $a$ . Thus,

[TABLE]

Similarly,

[TABLE]

We next prove that $\mbox{Locate}_{\cal P}$ is $(\varepsilon,\delta(1+e^{\varepsilon}))$ -differentially oblivious using Lemma 2.5. I.e., we prove that for every dataset $\mathbf{x}$ , the statistical distance between $\mathsf{Access}^{\mbox{Locate}_{\cal P}}(\mathbf{x})$ and $\mathsf{Access}^{\mbox{\sc Locate}^{\prime}_{\cal P}}(\mathbf{x})$ is at most $\delta$ . Notice that if all the thresholds are positive and all elements are sampled at most $2\log(2/\delta)$ times then $\mbox{Locate}_{\cal P}(\mathbf{x})$ and $\mbox{\sc Locate}^{\prime}_{\cal P}(\mathbf{x})$ have the same access pattern. By Claim A.3, the probability that there exists a $j$ that is sampled more than $2\log(2/\delta)$ times is at most $2^{-\log(2/\delta)}=\delta/2$ . We next observe that the probability that a threshold $\hat{T}=T+\mathrm{Lap}(\frac{1}{\varepsilon^{\prime}})$ is negative is at most $\delta/2$ . Recall that $\Pr[\mathrm{Lap}(\frac{1}{\varepsilon^{\prime}})\leq-t/\varepsilon^{\prime}]=\frac{1}{2}e^{-t}$ for every $t>0$ . Thus, $\Pr[\hat{T}\leq 0]=\Pr[\mathrm{Lap}(\frac{1}{\varepsilon^{\prime}})\leq-\frac{1}{\varepsilon^{\prime}}\ln(\frac{\log n}{\delta})]=\frac{\delta}{2\log n}$ . Let $A$ be the event that at least one of the $\log n$ thresholds $\hat{T}$ is at most 0 or some $j$ is sampled more than $2\log(2/\delta)$ times. By the union bound the probability of $A$ is at most $\delta$ . Therefore, for every set of access patterns $S$

[TABLE]

Thus, by Lemma 2.5, algorithm $\mbox{Locate}_{\cal P}$ is $(\varepsilon,\delta(1+e^{\varepsilon}))$ -differentially oblivious. ∎

We next analyze the running time and probe complexity of our algorithm. Let $p$ be the probability that a uniformly chosen element in $\mathbf{x}$ satisfies ${\cal P}$ . The non-private algorithm that samples elements until it finds an element satisfying ${\cal P}$ has expected running time $1/p$ and the probability that it does not stop after $m$ steps is $(1-p)^{m}=((1-p)^{1/p})^{mp}\leq e^{-mp}$ . We show that $\mbox{Locate}_{{\cal P}}$ has a similar behavior.

Lemma A.5.

Let $p$ be the probability that a uniformly chosen element in $\mathbf{x}$ satisfies ${\cal P}$ . Then, for every integral power of two $m$ the probability that algorithm $\mbox{Locate}_{{\cal P}}$ probes more than $m$ memory locations is less than $\delta/\log n+e^{-(m-2T)p+2T\ln m}$ . In particular, for $m=\Omega(\frac{T}{p}\log(\frac{T}{p}))$ , the probability is less than $\delta/\log n+e^{-\Omega(mp)}$ .

Proof.

Let $m=2^{i}$ . The probability that $\hat{T}\geq 2T$ is $\Pr[\mathrm{Lap}(\frac{1}{\varepsilon^{\prime}})\geq\frac{1}{\varepsilon^{\prime}}\ln\frac{\log n}{\delta}]=0.5e^{-\ln(\log n/\delta)}=\frac{\delta}{2\log n}<\frac{\delta}{\log n}.$ Assuming that $\hat{T}<2T$ , the probability that the algorithm does not halt after $m=2^{i}$ steps is less than

[TABLE]

∎

A.4 Proof of the Correctness and Privacy of Algorithm Search

Theorem 6.2 is proved in the next 3 claims. We start by analyzing the running time of the algorithm.

Claim A.6.

Let $\beta<1/n$ and $\varepsilon<\log^{2}n$ . The while loop in Algorithm Search is executed at most $2.5\log n$ times. Furthermore, the total running time of the algorithm is $O(\frac{1}{\varepsilon}\log^{2}{n}\log{\frac{1}{\beta}})$ .

Proof.

Let $\min_{0},\max_{0}$ and $\min_{1},\max_{1}$ be the values of $\min,\max$ before and after an execution of a step of the while loop in Algorithm Search. Note that

[TABLE]

Therefore, algorithm Search eliminates more than a quarter of the elements in each step of the while loop and the algorithm will halt after less than $2.5\log n$ steps.

Moreover, observe that Algorithm Search makes $k$ memory accesses in each step of the while loop and additional $k$ memory accesses after the loop. Thus, its running time is $O(\frac{1}{\varepsilon}\log^{2}{n}(\log{\log{n}}+\log{\frac{1}{\beta}}))=O(\frac{1}{\varepsilon}\log^{2}{n}\log{\frac{1}{\beta}})$ (since $\beta<1/n$ ). ∎

Claim A.7.

Algorithm Search returns the correct index with probability at least $1-\beta$ .

Proof.

Let $\bar{I}$ be the maximal index such that $x_{\bar{I}}\leq a$ (i.e., $\bar{I}$ is the index that algorithm Search should return). We prove by induction that if all Laplace noises in the algorithm satisfy $|\operatorname{{\rm Lap}}(\frac{1}{\varepsilon^{\prime}})|<\frac{\log{1/\beta^{\prime}}}{\varepsilon^{\prime}}$ then in each step of the algorithm $\min\leq\bar{I}\leq\max$ , hence the algorithm will return $\bar{I}$ in its last scan of $\mathbf{x}$ between $\min$ and $\max$ .

The basis of the induction is trivial since $0\leq\bar{I}\leq n$ . For the induction step, let $\min_{0},\max_{0}$ and $\min_{1},\max_{1}$ be the values of $\min,\max$ before and after an execution of a step of the while loop in Algorithm Search. By the induction hypothesis, $\min_{0}\leq\bar{I}\leq\max_{0}$ . The algorithm finds an index $I$ such that $\min_{0}+Ic\leq\bar{I}\leq\min_{0}+(I+1)c$ . By our assumption on the Laplace noise, $\min_{1}\leq\min_{0}+Ic$ , thus, $\min_{1}\leq\bar{I}$ . Similarly, $\max_{1}\geq\min_{0}+(I+1)c$ , thus, $\max_{1}\geq\bar{I}$ .

Recall that $\Pr[|\operatorname{{\rm Lap}}(\frac{1}{\varepsilon^{\prime}})|\geq t/\varepsilon^{\prime}]=e^{-t}$ for every $t>0$ . Thus, by Claim A.6 and the union bound, the probability that one of the Laplace noises has absolute value at least $\frac{\log{1/\beta^{\prime}}}{\varepsilon^{\prime}}$ is at most $(2.5\log n)\cdot\beta^{\prime}=\beta$ . Hence, the probability that algorithm Search returns the correct index $\bar{I}$ is at least $1-\beta$ . ∎

Next, we show that algorithm Search is $(\varepsilon,0)$ -differentially oblivious.

Claim A.8.

Algorithm Search is an $(\varepsilon,0)$ -differentially oblivious algorithm.

Proof.

We show below that each step of the while loop in algorithm Search is $(\varepsilon^{\prime},0)$ -differentially oblivious. Applying the basic composition theorem and Claim A.6, we obtain that the Search algorithm is $(\varepsilon=(2.5\log n)\varepsilon^{\prime},0)$ -differentially oblivious.

Fix a step of the loop and view it as an algorithm that returns $\min$ and $\max$ (given these values the access pattern of the next step is fixed). Let $\mathbf{x}$ and $\mathbf{x}^{\prime}$ be two neighboring datasets such that for some $j$ we have $x_{j}>x^{\prime}_{j}$ and $x_{i}=x^{\prime}_{i}$ for all $i<j$ . It holds that $x_{i-1}\leq x^{\prime}_{i}\leq x_{i}$ for every $i$ . Let $I(\mathbf{x})$ and $I(\mathbf{x}^{\prime})$ be the values computed in step 2c of the algorithm on inputs $\mathbf{x}$ and $\mathbf{x}^{\prime}$ respectively. Thus, the value $I(\mathbf{x})$ is at least the value $I(\mathbf{x}^{\prime})$ and can exceed it by at most one. Intuitively, since algorithm Search uses the Laplace mechanism, the probabilities of returning a value $\min$ on $\mathbf{x}$ and $\mathbf{x}^{\prime}$ are at most $e^{\pm\varepsilon^{\prime}}$ apart. Formally, if $\operatorname{{\rm Lap}}(1/\varepsilon^{\prime})+I(\mathbf{x})=\operatorname{{\rm Lap}}(1/\varepsilon^{\prime})+I(\mathbf{x}^{\prime})$ (where we consider two independent noises), then the algorithm returns the same value of $\min$ on both inputs. The claim follows since for every set $A$ :

[TABLE]

∎

Bibliography

[1] Noga Alon, Eldar Fischer, Michael Krivelevich, and Mario Szegedy. Efficient testing of large graphs. Combinatorica, 20(4):451–476, 2000.
[2] Gilad Asharov, Ilan Komargodski, Wei-Kai Lin, Kartik Nayak, and Elaine Shi. OptORAMa: Optimal oblivious RAM. IACR Cryptology ePrint Archive, 2018:892, 2018.
[3] Marina Blanton, Aaron Steele, and Mehrdad Aliasgari. Data-oblivious graph algorithms for secure computation and outsourcing. In Kefei Chen, Qi Xie, Weidong Qiu, Ninghui Li, and Wen-Guey Tzeng, editors, 8th ACM Symposium on Information, Computer and Communications Security, ASIA CCS ’13, pages 207–218. ACM, 2013.
[4] David Cash, Paul Grubbs, Jason Perry, and Thomas Ristenpart. Leakage-abuse attacks against searchable encryption. In Indrajit Ray, Ninghui Li, and Christopher Kruegel, editors, Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, 2015, pages 668–679. ACM, 2015.
[5] T.-H. Hubert Chan, Kai-Min Chung, Bruce M. Maggs, and Elaine Shi. Foundations of differentially oblivious algorithms. In Timothy M. Chan, editor, Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2019, pages 2448–2467. SIAM, 2019.
[6] Srinivas Devadas, Marten van Dijk, Christopher W. Fletcher, Ling Ren, Elaine Shi, and Daniel Wichs. Onion ORAM: A constant bandwidth blowup oblivious RAM. In Eyal Kushilevitz and Tal Malkin, editors, Theory of Cryptography - 13th International Conference, TCC 2016-A, volume 9563 of Lecture Notes in Computer Science, pages 145–174. Springer, 2016.
[7] Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam D. Smith. Calibrating noise to sensitivity in private data analysis. In Shai Halevi and Tal Rabin, editors, Theory of Cryptography, Third Theory of Cryptography Conference, TCC 2006, volume 3876 of Lecture Notes in Computer Science, pages 265–284. Springer, 2006.
[8] David Eppstein, Michael T. Goodrich, and Roberto Tamassia. Privacy-preserving data-oblivious geometric algorithms for geographic data. In Divyakant Agrawal, Pusheng Zhang, Amr El Abbadi, and Mohamed F. Mokbel, editors, 18th ACM SIGSPATIAL International Symposium on Advances in Geographic Information Systems, ACM-GIS 2010, pages 13–22. ACM, 2010.

Exploring Differential Obliviousness (Work supported by NSF grant No. 1565387 TWC: Large: Collaborative: Computing Over Distributed Sensitive Data.)

1 Introduction

1.1 Differential Obliviousness

1.2 This Work: Exploring Differential Obliviousness

Property testing.

Locating an Object Satisfying a Property.

Prefix Sum.

1.3 Background Work

2 Definitions

2.1 Model of Computation

2.2 Oblivious Algorithms

Definition 2.1 (Fully Oblivious Algorithms).

2.3 Differentially Oblivious Algorithms

Definition 2.2 (Neighbor-respecting).

Definition 2.3.

Remark 2.4.

Lemma 2.5.

3 Differentially Oblivious Property Testing of Dense Graphs Properties

Definition 3.1 ([12]).

Example 3.2 ([12]).

Theorem 3.3.

4 Lower Bounds on Testing Connectivity

Theorem 4.1.

Proof.

Observation 4.2.

Claim 4.3.

Proof.

Claim 4.4.

Proof.

Claim 4.5.

Proof.

Claim 4.6.

Proof.

5 Differentially Oblivious Algorithm for Locating an Object

Our Algorithm.

Theorem 5.1.

6 Differentially Oblivious Prefix Sum

Intuition.

The Search Algorithm.

Remark 6.1.

Theorem 6.2.

6.1 Dealing with Multiple Queries

Theorem 6.3.

Proof.

Appendix A Missing Proofs

A.1 Proof of Lemma 2.5

Proof.

A.2 Proof of the Correctness and Privacy of Algorithm $\mbox{\sc Tester}_{\cal T}$

Lemma A.1.

Proof.

Lemma A.2.

Proof.

A.3 Proof of the Correctness and Privacy of Algorithm $\mbox{\sc Locate}_{\cal P}$

Claim A.3.

Proof.

Lemma A.4.

Proof.

Lemma A.5.

Proof.

A.4 Proof of the Correctness and Privacy of Algorithm Search

Claim A.6.

Proof.

Claim A.7.

Proof.

Claim A.8.

Proof.
