This paper introduces a new existential approximate lifting called $ ext{ extsterling}$-lifting, unifying existing notions and enhancing the tools for proving differential privacy through cleaner properties and more general constructions.
Contribution
It proposes a novel $ ext{ extsterling}$-lifting that unifies existing approximate liftings, providing more precise composition theorems and cleaner properties for differential privacy proofs.
Findings
01
$ ext{ extsterling}$-lifting is equivalent to Sato's universal construction for discrete measures
02
Unifies all known notions of approximate lifting
03
Enables richer and more precise differential privacy proofs
Abstract
Recent developments in formal verification have identified approximate liftings (also known as approximate couplings) as a clean, compositional abstraction for proving differential privacy. This construction can be defined in two styles. Earlier definitions require the existence of one or more witness distributions, while a recent definition by Sato uses universal quantification over all sets of samples. These notions have each have their own strengths: the universal version is more general than the existential ones, while existential liftings are known to satisfy more precise composition principles. We propose a novel, existential version of approximate lifting, called ⋆-lifting, and show that it is equivalent to Sato's construction for discrete probability measures. Our work unifies all known notions of approximate lifting, yielding cleaner properties, more general…
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Recent developments in formal verification have identified approximate
liftings (also known as approximate couplings) as a clean,
compositional abstraction for proving differential privacy. This construction
can be defined in two styles. Earlier definitions require the
existence of one or more witness distributions, while a recent definition by
Sato uses universal quantification over all sets of samples. These notions
each have their own strengths: the universal version is more general
than the existential ones, while existential liftings are known to satisfy
more precise composition principles.
We propose a novel, existential version of approximate lifting, called
⋆-lifting, and show that it is equivalent to Sato’s construction
for discrete probability measures. Our work unifies all known notions of
approximate lifting, yielding cleaner properties, more general constructions,
and more precise composition theorems for both styles of lifting, enabling
richer proofs of differential privacy. We also clarify the relation between
existing definitions of approximate lifting, and consider more general
approximate liftings based on f-divergences.
Key words and phrases:
probabilistic, relational lifting, differential privacy, max flow/min cut
1. Introduction
Differential privacy [Dwork et al., 2006] is a strong, rigorous notion of statistical
privacy. Informally, differential privacy guarantees to every individual that
their participation in a database query will have a quantitatively small effect
on the query results, limiting the amount that the query answer depends on their
private data. The definition of differential privacy is parametrized by two
non-negative real numbers, (ε,δ), which quantify the effect of
individuals on the output of the private query: smaller values give stronger
privacy guarantees. The main strengths of differential privacy lie in its
theoretical elegance, minimal assumptions, and flexibility.
Recently, programming language researchers have developed approaches based on
dynamic analysis, type systems, and program logics for formally proving
differential privacy for programs. (We refer the interested reader to a recent
survey [Barthe et al., 2016c] for an overview of this growing field.) In this
paper, we consider approaches based on relational program
logics [Barthe et al., 2013, Barthe and Olmedo, 2013, Olmedo, 2014, Barthe et al., 2016a, b, Sato, 2016].
To capture the quantitative nature of differential privacy, these systems rely
on a quantitative generalization of probabilistic couplings (see, e.g.,
[Lindvall, 2002, Thorisson, 2000, Villani, 2008]), called approximate liftings or
(ε,δ)-liftings. Prior works have considered several potential
definitions. While all definitions support compositional reasoning and enable
program logics that can verify complex examples from the privacy literature, the
various notions of approximate liftings have different strengths and weaknesses.
Broadly speaking, the first class of definitions require the existence of one or
two witness distributions that “couple” the output distributions from
program executions on two related inputs (intuitively, the true database and the
true database omitting one individual’s record). The earliest
definition [Barthe et al., 2013] supports accuracy-based reasoning for the Laplace
mechanism, while subsequent definitions [Barthe and Olmedo, 2013, Olmedo, 2014] support
more precise composition principles from differential privacy and can be
generalized to other notions of distance on distributions. These definitions,
and their associated program logics, were designed for discrete distributions.
In the course of extending these ideas to continuous distributions,
Sato [2016] proposes a radically different notion of approximate lifting that
does not rely on witness distributions. Instead, it uses a universal
quantification over all sets of samples. Sato shows that this definition is
strictly more general than the existential versions, but it is unclear (a)
whether the gap can be closed and (b) whether his construction satisfies the
same composition principles enjoyed by some existential definitions.
As a consequence, no single definition is known to satisfy the properties needed
to support all existing formalized proofs of differential privacy. Furthermore,
some of the most involved privacy proofs cannot be formalized at all, as their
proofs require a combination of constructions satisfied by existential or
universal liftings, but not both.
Outline of the paper
After introducing mathematical preliminaries in Section 2, we introduce
our main technical contribution: a new, existential definition of approximate
lifting. This construction, which we call ⋆-lifting, is a
generalization of an existing definition by Barthe and Olmedo [2013], Olmedo [2014]. The
key idea is to slightly enlarge the domain of witness distributions with a
single generic point, broadening the class of approximate liftings. By a maximum
flow/minimum cut argument, we show that ⋆-liftings are equivalent to
Sato’s lifting over discrete distributions. This equivalence can be viewed as an
approximate version of Strassen’s theorem [Strassen, 1965], a
classical result in probability theory characterizing the existence of
probabilistic couplings. We present our definition and the proof of equivalence
in Section 3.
Then, we show that ⋆-liftings satisfy desirable theoretical properties by
leveraging the equivalence of liftings in two ways. In one direction, Sato’s
definition gives simpler proofs of more general properties of ⋆-liftings.
In the other direction, ⋆-liftings—like previously proposed existential
liftings—can smoothly incorporate composition principles from the theory of
differential privacy. In particular, our connection shows that Sato’s definition
can use these principles in the discrete case. We describe the key theoretical
properties of ⋆-liftings in Section 4.
Finally, we provide a thorough comparison of ⋆-lifting with other existing
definitions of approximate lifting in Section 5, introduce a
symmetric version of ⋆-lifting that satisfies the so-called advanced
composition theorem from differential privacy Dwork et al. [2010] in
Section 6, and generalize ⋆-liftings to approximate liftings
based on f-divergences in Section 7.
Overall, the equivalence of ⋆-liftings and Sato’s lifting, along with the
natural theoretical properties satisfied by the common notion, suggest that
these definitions are two views on the same concept: an approximate version of
probabilistic coupling.
2. Background
To model probabilistic data, we work with discrete sub-distributions.
{defi}
A sub-distribution over a set A is defined by its
mass function μ:A→[0,1], which gives the probability of the
singleton events a∈A. This mass function must be s.t.
∣μ∣=△∑a∈Aμ(a) is well-defined and at most 1.
In particular, the supportsupp(μ)=△{a∈A∣μ(a)=0}
must be discrete (i.e. finite or countably infinite).
When the weight∣μ∣ is equal to 1, we call μ a
(proper) distribution. We let D(A) denote the set of
sub-distributions over A.
EventsE are predicates on A; the probability of an event E(x)
w.r.t. μ, written Px∼μ[E(x)] or Pμ[E], is defined as
∑x∈A∣E(x)μ(x).
Simple examples of sub-distributions include the null
sub-distribution0A∈D(A), which maps each element
of A to [math], and the Dirac distribution centered on x,
written 1x, which maps x to 1 and all other elements to
[math].
One can equip distributions with the usual monadic structure using the Dirac
distributions 1x for the unit and distribution
expectationEx∼μ[f(x)] for the bind; if μ is a
distribution over A and f has type A→D(B), then the bind
defines a sub-distribution over B via
Ea∼μ[f(a)]:b↦∑aμ(a)⋅f(a)(b).
If f:A→B, we can lift f to a function f♯:D(A)→D(B) as
f♯(μ)=△Ea∼μ[1f(a)] ;
more explicitly,
f♯(μ):b↦Pa∼μ[a∈f−1(b)].
For instance, when working with sub-distributions over pairs, we have the
probabilistic versions π1♯:D(A×B)→D(A) and π2♯:D(A×B)→D(B) (called the marginals) of the usual
projections π1 and π2 by lifting. One can check that the first
and second marginalsπ1♯(μ) and π2♯(μ) of a distribution
μ over A×B are given by the following equations:
π1♯(μ)(a)=∑b∈Bμ(a,b) and
π2♯(μ)(b)=∑a∈Aμ(a,b).
When f:A→D(B), we will abuse notation and write the lifting f♯:D(A)→D(B) to mean
f♯(μ)=△Ex∼μ[f(x)];
this is sometimes called the Kleisli extension of f.
Finally, we will often consider sums of weight functions over sets. If α:A→R≥0 maps A to the non-negative real numbers, we write α[X]∈R≥0∪{∞} for ∑x∈Xα(x). Moreover, if α:A×B→R≥0, we write α[X,Y] (resp. α[x,Y], α[X,y]) for α[X×Y] (resp. α[{x}×Y, α[X×{y}]). Note that for a sub-distribution μ∈D(A) and an
event E⊆A, Pμ[E]=μ[E].
We now review the definition of differential privacy.
{defi}
[Dwork et al. [2006]]
Let ε,δ≥0 be real parameters. A probabilistic computation
M:A→D(B) satisfies (ε,δ)-differential privacy
w.r.t. an adjacency relation Phys.Rev.E⊆A×A if for every pair of
inputs (a,a′)∈Phys.Rev.E and every subset of outputs E⊆B, we have
[TABLE]
Differential privacy is closely related to a relaxed version of
distance—technically, an f-divergence—on distributions.
{defi}
[Barthe et al. [2013], Barthe and Olmedo [2013], Olmedo [2014]]
Let ε≥0. The ε-DP divergenceΔε(μ1,μ2) between two sub-distributions
μ1,μ2∈D(B) is defined as
[TABLE]
Then, differential privacy admits an alternative characterization based on
DP divergence.
Lemma 1**.**
A probabilistic computation M:A→D(B) satisfies
(ε,δ)-differential privacy w.r.t. an
adjacency relation Phys.Rev.E⊆A×A iff
Δε(M(a),M(a′))≤δ for every pair of
inputs (a,a′)∈Phys.Rev.E.
Our new definition of approximate lifting is inspired by a version of
approximate liftings involving two witness distributions, proposed by
Barthe and Olmedo [2013], Olmedo [2014].
{defi}
[Barthe and Olmedo [2013], Olmedo [2014]]
Let μ1∈D(A) and μ2∈D(B) be sub-distributions,
ε,δ∈R≥0 and R be a binary relation over
A and B. An (ε,δ)-approximate2-lifting of μ1 and μ2 for R is a pair
(μ\mspace−1.5mu⊲,μ\mspace−1.5mu⊳) of sub-distributions over A×B s.t.
(1)
π1♯(μ\mspace−1.5mu⊲)=μ1 and π2♯(μ\mspace−1.5mu⊳)=μ2;
2. (2)
Δε(μ\mspace−1.5mu⊲,μ\mspace−1.5mu⊳)≤δ; and
3. (3)
supp(μ\mspace−1.5mu⊲)⊆R and supp(μ\mspace−1.5mu⊳)⊆R.
We write
μ1Rε,δ(2)μ2
if there exists an (ε,δ)-approximate (2-)lifting of μ1 and μ2 for R; the superscript ⋅(2) indicates that
there are two witnesses μ\mspace−1.5mu⊲ and μ\mspace−1.5mu⊳ in this definition of
lifting.
Combined with Lemma 1, a probabilistic computation M:A→D(B) is (ε,δ)-differentially private if and only if for
every two adjacent inputs aϕa′, there is an approximate
lifting of the equality relation: M(a)=ε,δ(2)M(a′).
2-liftings can be generalized by varying the notion of distance given by
Δε; we will return to this point in Section 7. These
liftings also satisfy useful theoretical properties, but some of the properties
are not as general as we would like. For example, it is known that 2-liftings
satisfy the following mapping property.
Let μ1∈D(A1), μ2∈D(A2) be distributions
f1:A1→B1, f2:A2→B2 be surjective maps and R
be a binary relation on B1 and B2. Then
[TABLE]
where
a1Sa2⟺△f1(a1)Rf2(a2).
This property can be used to pull back an approximate lifting on two
distributions over B1,B2 to an approximate lifting on two distributions
over A1,A2. For applications in program logics, B1,B2 could be the
domain of a program variable, A1,A2 could be the set of memories, and f1,f2 could project a memory to a program variable. While the mapping theorem is
quite useful, it is puzzling why it only applies to surjective maps. For
instance, this theorem cannot be used when the maps f1,f2 inject a smaller
space into a larger space.
For another example, there exist 2-liftings of the following form, sometimes
called the optimal subset coupling.
Let μ∈D(A) and consider two subsets P1⊆P2⊆A. Suppose that P2 is a strict subset of A. Then, we have the
following equivalence:
[TABLE]
where
a1Ra2⟺△a1∈P1⟺a2∈P2.
In this construction, it is puzzling why the larger subset P2 must be a
strict subset of the domain A. For example, this theorem does not apply
for P2=A, but we may be able to construct the approximate lifting if we
simply embed A into a larger space A′—even though μ has support over
A! Furthermore, it is not clear why the subsets must be nested, nor is it
clear why we can only relate μ to itself.
These shortcomings suggest that the definition of 2-liftings may be
problematic. While the distance condition appears to be the most constraining
requirement, the marginal and support conditions are responsible for the main
issues.
Witnesses must have support in the relation R.
For some relations R, there may be elements a such that aRb does not hold for any b, or vice versa. It can be impossible to find
witnesses with the correct marginals on these elements while satisfying the
support condition, even if the distance condition is satisfied. At a high level,
there are situations where it is possible to construct a pair μ\mspace−1.5mu⊲ and
μ\mspace−1.5mu⊳ satisfying the distance requirement, but where μ\mspace−1.5mu⊳ needs
additional mass to achieve the marginal requirement for some element b. Adding
this mass anywhere preserves the distance bound between μ\mspace−1.5mu⊲ and
μ\mspace−1.5mu⊳—since it only increases the mass of μ\mspace−1.5mu⊳ while preserving
the mass of μ\mspace−1.5mu⊲, and the distance bound is asymmetric—but if there is
no element a such that aRb then we cannot place this mass
within the required support, and μ\mspace−1.5mu⊲ and μ\mspace−1.5mu⊳ cannot be related
by an approximate lifting.
No canonical choice of witnesses.
A related problem is that the marginal requirement only constrains one marginal
of each witness distribution. Along the other component, the witnesses may place
the mass anywhere on any pair in the relation. As a result, witnesses to an
approximate lifting μ1Rε,δ(2)μ2 are
sometimes required to place mass outside of supp(μ1)×supp(μ2),
even though intuitively only elements in the support of the related
distributions should be relevant to the lifting. This theoretical flaw makes it
difficult to establish basic mapping and support properties of approximate
liftings.
{exa}
We illustrate these problems with a concrete example. Consider the geometric
distribution with parameter p=1/2, a distribution over the natural numbers
which models the distribution of number of flips of a fair coin before the
coin first comes up tails. Formally, the distribution γ∈D(N)
is defined by γ(k)=1/2k+1. Consider the binary relation R={(x1,x2)∣x1+1=x2} over N. Now, γ cannot be
related to itself via an approximate lifting γRε,δ(2)γ for any parameters ε,δ. To see why, the second witness μ\mspace−1.5mu⊳ must satisfy the second
marginal condition at k=0, so it must put total weight γ(0)=1/2
on pairs of the form (−,0). These pairs must belong to the relation R,
but there is no x1∈N such that x1+1=0. However, there
is an approximate lifting γRln(2),0(2)γ, where γ and
R+1 are extended to the integers Z. For instance, the two joint
distributions with support μ\mspace−1.5mu⊲(z,z+1)=1/2z+1 for z≥0,
and μ\mspace−1.5mu⊳(z,z+1)=1/2z+2 for z≥−1 form witnesses.
This behavior is a sign that the notion of approximate lifting is not
well-behaved: the support of γ remains the non-negative
integers, but somehow embedding γ into a larger space enables
additional approximate liftings.
3. ⋆-Liftings and Strassen’s Theorem
To improve the theoretical properties of 2-liftings, we propose a simple
extension: allow witnesses to be distributions over a larger set.
{nota}
For a set A, we write A⋆ for A⊎{⋆}. For a
distribution η∈D(C) and a subset C′⊆C, we write
η∣C′ for the restriction of η to C′, i.e., the
sub-distribution given by η∣C′(c)=η(c) for c∈C′, and
η∣C′(c)=0 otherwise.
{defi}
[⋆-lifting]
Let μ1∈D(A) and μ2∈D(B) be sub-distributions,
ε,δ∈R≥0 and R be a binary relation over A
and B. An (ε,δ)-approximate⋆-lifting of
μ1 and μ2 for R is a pair of sub-distributions
η\mspace−1.5mu⊲∈D(A×B⋆) and
η\mspace−1.5mu⊳∈D(A⋆×B) s.t.
(1)
π1♯(η\mspace−1.5mu⊲)=μ1 and π2♯(η\mspace−1.5mu⊳)=μ2;
2. (2)
supp(η\mspace−1.5mu⊲∣A×B),supp(η\mspace−1.5mu⊳∣A×B)⊆R; and
3. (3)
Δε(η\mspace−1.5mu⊲,η\mspace−1.5mu⊳)≤δ,
where η∙ is the extension of
η∙ to D(A⋆×B⋆) given by the evident
inclusions from A×B⋆ and A⋆×B to A⋆×B⋆.
We write
μ1Rε,δ(⋆)μ2
if there exists an (ε,δ)-approximate ⋆-lifting of
μ1 and μ2 for R.
By adding an element ⋆, we address both problems discussed at
the end of the previous section. First, for every a∈A witnesses may
place mass at (a,⋆), and for every b∈B witnesses may place
mass at (⋆,b). Second, ⋆ serves as a generic element
where all mass outside the supports supp(μ1)×supp(μ2) may located, giving more control over the form of the witnesses.
Formally, ⋆-liftings satisfy the following natural support property.
Lemma 4**.**
Let μ1∈D(A) and μ2∈D(B) be distributions such that
μ1Rε,δ(⋆)μ2 .
Then, there are witnesses with support contained in supp(μ1)⋆×supp(μ2)⋆.
Let μ\mspace−1.5mu⊲ and μ\mspace−1.5mu⊳ be any pair of witnesses to the approximate
lifting. We will construct witnesses η\mspace−1.5mu⊲,η\mspace−1.5mu⊳ with the
desired support. For ease of notation, let Si=△supp(μi) for
i∈{1,2}. Define:
[TABLE]
Evidently, η\mspace−1.5mu⊲ and η\mspace−1.5mu⊳ have support in S1⋆×S2⋆. Additionally, it is straightforward to check that
π1♯(η\mspace−1.5mu⊲)=π1♯(μ\mspace−1.5mu⊲)=μ1 and π2♯(η\mspace−1.5mu⊳)=π2♯(μ\mspace−1.5mu⊳)=μ2 so η\mspace−1.5mu⊲ and η\mspace−1.5mu⊳ have the desired
marginals.
It only remains to check the distance condition. By the definition of the
distance Δε, we know that there are non-negative values
δ(a,b) such that (i) μ\mspace−1.5mu⊲(a,b)≤eεμ\mspace−1.5mu⊳(a,b)+δ(a,b) and (ii) ∑a,bδ(a,b)≤δ. We can define new constants:
[TABLE]
Since μ\mspace−1.5mu⊲(⋆,b)=η\mspace−1.5mu⊲(⋆,b)=0
for all b∈B⋆, and μ\mspace−1.5mu⊳(a,b)=η\mspace−1.5mu⊳(a,b)=0 for all b∈/S2, point (i) holds for
the witnesses η\mspace−1.5mu⊲,η\mspace−1.5mu⊳ and constants ζ(a,b). Since
∑a,bζ(a,b)=∑a,bδ(a,b)≤δ, point (ii)
holds as well. Hence, Δε(η\mspace−1.5mu⊲,η\mspace−1.5mu⊳)≤δ and we have witnesses for the desired
approximate lifting.
3.1. Basic Properties
⋆-liftings satisfy key properties enjoyed by existing notions of
approximate lifting. To start, ⋆-liftings characterize differential
privacy.111The proofs of the next two lemmas use an equivalence that we will soon prove
in Theorem 9. This is purely for convenience—these proofs
could also be performed separately, and in any case Theorem 9
does not use these lemmas so there is no circularity.
Lemma 5**.**
A randomized algorithm P:A→D(B) is
(ε,δ)-differentially private w.r.t. Phys.Rev.E if for all
(a1,a2)∈Phys.Rev.E we have
P(a1)=ε,δ(⋆)P(a2).
(Here, we are turning the equality relation on B to an approximate lifting
relating distributions over B.)
Let P be (ε,δ)-differentially private w.r.t. Phys.Rev.E
and let (a1,a2)∈Phys.Rev.E. Let X be a subset of B. By
definition of differential privacy, we have
P(a1)[X]≤eε⋅P(a2)[X]+δ=eε⋅P(a2)[(=)(X)]+δ.
Recall that the image of a set X under a binary relation is simply the set
of all elements related to some element in X. In particular, (=)(X) is
just X. Hence, by application of Theorem 9, we have P(a1)=ε,δ(⋆)P(a2).
**(⟸): **
By application of Theorem 9, we have that
[TABLE]
This is the definition of P being
(ε,δ)-differentially private w.r.t. Phys.Rev.E.
The next lemma establishes several other basic properties of ⋆-liftings:
monotonicity, and closure under relational and sequential composition.
Lemma 6**.**
•
Let μ1∈D(A), μ2∈D(B), and R
be a binary relation over A and B. If
μ1Rε,δ(⋆)μ2, then for any
ε′≥ε, δ′≥δ and
S⊇R, we have
μ1Sε′,δ′(⋆)μ2.
•
Let μ1∈D(A), μ2∈D(B),
μ3∈D(C) and R (resp. S) be a binary relation
over A and B (resp. over B and C). If
μ1Rε,δ(⋆)μ2 and
μ2Sε′,δ′(⋆)μ3,
then
μ1(S∘R)ε+ε′,δ+eε⋅δ′(⋆)μ3.
•
For i∈{1,2}, let μi∈D(Ai) and
ηi:Ai→D(Bi). Let R (resp. S) be a binary
relation over A1 and A2 (resp. over B1 and B2). If
μ1Rε,δ(⋆)μ2
for some ε,δ≥0 and for any
(a1,a2)∈R, we have
η1(a1)Sε′,δ′(⋆)η2(a2)
for some ε′,δ′≥0,
then
Let
ε=△ε+ε′ and
δ=△δ+eε⋅δ′.
By Theorem 9, it is sufficient to show that
μ1(X)≤eε⋅μ1(S(R(X)))+δ
for any set X. We have:
[TABLE]
•
We know that
∃⟨μ\mspace−1.5mu⊲,μ\mspace−1.5mu⊳⟩◀ε,δR⟨μ1&μ2⟩.
Likewise, for a=△(a1,a2)∈R,
∃⟨η⊲,a,η⊳,a⟩◀ε′,δ′S⟨η1(a1)&η2(a2)⟩.
Let η\mspace−1.5mu⊲ and η\mspace−1.5mu⊳ be the following distribution
constructors:
[TABLE]
and let
ξ\mspace−1.5mu⊲=△Eμ\mspace−1.5mu⊲[η\mspace−1.5mu⊲]
(resp. ξ\mspace−1.5mu⊳=△Eμ\mspace−1.5mu⊳[η\mspace−1.5mu⊳]).
We now prove that:
[TABLE]
The marginal and support conditions are immediate. The distance
condition is obtained by an immediate application of the previous
point.
3.2. Equivalence with Sato’s Definition
In recent work on verifying differential privacy over continuous distributions,
Sato [2016] proposes an alternative definition of approximate lifting. In the
special case of discrete distributions—where all events are measurable—his
definition can be stated as follows.
{defi}
[Sato [2016]]
Let μ1∈D(A) and μ2∈D(B), R be a binary relation
over A and B and ε,δ≥0. The distributions
μ1 and μ2 are related by a (witness-free)
(ε,δ)-approximate lifting for R if
[TABLE]
We write R(X)={b∈B∣∃a∈X.(a,b)∈R}⊆B.
Notice that this definition has no witness distributions at all; instead, it
uses a universal quantifier over all subsets. We can show that
⋆-liftings are equivalent to Sato’s definition in the case of
discrete distributions. This equivalence is reminiscent of Strassen’s
theorem from probability theory, which characterizes the existence of
probabilistic couplings.
Let μ1∈D(A), μ2∈D(B) be two proper distributions,
and let R be a binary relation over A and B. Then there exists
a joint distribution μ∈D(A×B) with support in R such
that π1♯(μ)=μ1 and π2♯(μ)=μ2 if and only if
[TABLE]
Our result (Theorem 9) can be viewed as a generalization of
Strassen’s theorem to approximate couplings. The key ingredient in our proof is
the max-flow min-cut theorem; we begin by reviewing the basic setting.
{defi}
[Flow network]
A flow network is a structure ((V,E),⊤,⊥,c)
s.t. N=(V,E) is a loop-free directed graph without
infinite simple path (or rays), ⊤ and ⊥ are two distinct
distinguished vertices of N s.t. no edge starts from
⊥ and ends at ⊤, and
c:E→R+∪{+∞}
is a function assigning to each edge of N a capacity. The capacity
c is extended to V2 by assigning capacity [math] to any pair
(u,v) s.t. (u,v)∈/E.
{defi}
[Flow]
Given a flow network N=△((V,E),⊤,⊥,c),
a function f:V2→R is a flow for N
iff
(1)
∀u,v∈V.f(u,v)≤c(u,v),
2. (2)
∀u,v∈V.f(u,v)=−f(v,u), and
3. (3)
∀u∈V.u∈/{⊤,⊥}⟹∑v∈Vf(u,v)=0
(Kirchhoff’s Law).
The mass∣f∣ of a flow f is defined as
∣f∣=△∑v∈Vf(⊤,v)∈R{∪+∞}.
{defi}
[Cut]
Given a flow network N=△((V,E),⊤,⊥,c),
a cut for N is any set C⊆V that
partition V s.t. ⊤∈V but ⊥∈/V. The
cut-setE(C)⊆E of a cut C is the set of edges
crossing the cut:
{(u,v)∈E∣u∈C,v∈/C}.
The capacity∣C∣∈R≥0∪{∞} of a cut C is defined as
∣C∣=△∑(u,v)∈E(C)c(u,v).
For finite flow networks, the maximum flow is equal to the minimum cut (see,
e.g., Kleinberg and Tardos [2005]).
Aharoni et al. [2011] generalize this theorem to networks with
countable vertices and edges under certain conditions. We will use a consequence
of their result.
Theorem 8** (Weak Countable Max-Flow Min-Cut).**
Let N be flow network with (a) no infinite directed paths and (b)
finite total capacity leaving ⊤ and entering ⊥. Then,
[TABLE]
and both supremum and infimum are achieved by some flow and cut, respectively.
We are now ready to prove an approximate version of Strassen’s theorem, thereby
showing equivalence between ⋆-liftings and Sato’s liftings.
Theorem 9**.**
Let μ1∈D(A) and μ2∈D(B), R be a binary
relation over A and B and ε,δ∈R≥0. Then,
μ1Rε,δ(⋆)μ2
iff
∀X⊆A.μ1(X)≤eε⋅μ2(R(X))+δ.
Proof 3.7**.**
We detail the reverse direction; the forward direction is immediate. We can
assume that A and
B are countable; in the case where A and B are not both
countable, we first consider the restriction of μ1 and μ2
to their respective supports—which are countable sets—and
construct witnesses to the ⋆-lifting. The witnesses can then
be extended to an approximate coupling of μ1 and μ2 by adding a null
mass to the extra points.
Let ω=△∣μ2∣+e−ε⋅δ and let
⊤ and ⊥ be fresh symbols. For any set X, define
X⊤ and X⊥ resp. as {x⊤∣x∈X}
and {x⊥∣x∈X}.
Let N be the flow network of Figure 1
whose resp. source and sink are ⊤ and ⊥, whose set of
vertices V is
{⊤,⊥}⊎(A⋆)⊤⊎(B⋆)⊥,
and whose set of edges E is
E⊤⊎E⊥⊎ER⊎E⋆ with
[TABLE]
Let C be a cut of N. In the following, we sometimes use C
to denote both the cut C and its cut-set E(C). We check
∣C∣≥ω. If C∩ER=∅ then
∣C∣=∞.
Note that C∩E⋆=∅ implies
C∩ER=∅.
If (⊤,⋆⊤)∈C and
(⊥,⋆⊥)∈/C then we must have
E⊤⊆C. This implies that ∣C∣≥ω
since E⊤⊎{(⊤,⋆⊤)} is a cut with
capacity ω. If (⊤,⋆⊤)∈/C and
(⊥,⋆⊥)∈C then we have
∣C∣≥ω in the similar way as above.
Otherwise (i.e. C∩ER=∅ and
E⋆⊆C), for C to be a cut, we must have
R(A−A†)⊆B† where
A†=△{x∈A∣(⊤,x⊤)∈C}
and
B†=△{y∈B∣(y⊥,⊥)∈C}.
Thus,
[TABLE]
Hence, E⊤⊎{(⋆⊥,⊥)} is a minimum
cut with capacity ω.
By Theorem 8,
we obtain a maximum flow f with mass ω.
Note that the flow f saturates the capacity of
all edges in E⊤, E⊥, and E⋆.
Let f^:(a,b)∈A⋆×B⋆↦f(a⊤,b⊥).
We now define the following distributions:
[TABLE]
We clearly have π1♯(η\mspace−1.5mu⊲)=μ1 and
π2♯(η\mspace−1.5mu⊳)=μ2.
Moreover, by construction of the
flow network N, supp(f^∣A×B)⊆R. Hence,
supp(η\mspace−1.5mu⊲∣A×B),supp(η\mspace−1.5mu⊳∣A×B)⊆R.
It remains to show that
Δε(η\mspace−1.5mu⊲,η\mspace−1.5mu⊳)≤δ.
Let X be a subset of A⋆×B⋆.
Let
Xa=△{a∈A∣(a,⋆)∈X},
Xb=△{b∈B∣(⋆,b)∈X} and
X=△X∩(A×B).
Then,
[TABLE]
The last equality holds by Kirchhoff’s law:
f^[A×{⋆}]=∑a∈Af(a⊤,⋆⊥)=f(⋆⊥,⊥)=e−ε⋅δ.
+=
Proof 3.8** (Proof of ).**
To show the forward direction of Theorem 9, let X⊆A
and R(X)=B−R(X).
Then, we have
[TABLE]
as desired.
3.3. Alternative Proof of Approximate Strassen’s Theorem
We can provide an alternative, arguably simpler proof of the reverse direction
of the approximate Strassen’s theorem (Theorem 9). Instead of
relying on the max-flow min-cut theorem for countable networks by
Aharoni et al. [2011], we apply the more standard result on
finite networks and then pass from approximate liftings on finite restrictions
of the two target distributions to an approximate lifting of the limit
distributions, via a limiting argument. The results of this section have been
formalized in the Coq proof
assistant.222https://github.com/strub/xhl
We first start with a simple technical lemma.
Lemma 10**.**
Let μ1∈D(A) and μ2∈D(B) (with respective
support S1 and S2), R be a binary relation over
A and B and ε,δ∈R≥0. Then,
(μ1)∣S1Rε,δ(⋆)(μ2)∣S2
implies
μ1Rε,δ(⋆)μ2.
If ηL∈D(S1×S2⋆),
ηR∈D(S1⋆×S2) are the witnesses of
(μ1)∣S1Rε,δ(⋆)(μ2)∣S2,
their extensions to D(A×B⋆) and
D(A⋆×B), padding with [math], form witnesses for
μ1Rε,δ(⋆)μ2.
We can now prove the theorem for distributions over finite domains:
Let μ1∈D(A) and μ2∈D(B) be sub-distributions with
finite supports, R be a binary relation over A and B and
ε,δ∈R≥0. If for all X⊆A we have μ1[X]≤eε⋅μ2[R(X)]+δ, then μ1Rε,δ(⋆)μ2.
The case for distributions with finite domains can be proven using
the same proof of Theorem 9, using the standard
(finite) max-flow min-cut theorem. The result extends to
distributions with finite supports via Lemma 10.
To extend the finite case to distributions over countable sets, we need a lemma
that will allow us to assume that the witnesses are within a multiplicative
factor of each other, except possibly on pairs with ⋆.
Lemma 12**.**
Let μ1∈D(A) and μ2∈D(B) s.t.
μ1Rε,δ(⋆)μ2.
Then, there exists η\mspace−1.5mu⊲ and η\mspace−1.5mu⊳ witnessing the
lifting s.t. for (a,b)∈A×B, we have:
Let μ\mspace−1.5mu⊲, μ\mspace−1.5mu⊳ be witnesses of
μ1Rε,δ(⋆)μ2.
For (a,b)∈A×B, let
[TABLE]
and define η\mspace−1.5mu⊲, η\mspace−1.5mu⊳ as
[TABLE]
Note that η\mspace−1.5mu⊲ and η\mspace−1.5mu⊳ are well-defined as
sub-distributions by the marginal conditions for μ\mspace−1.5mu⊲ and
μ\mspace−1.5mu⊳. To show that the witnesses are non-negative, let a∈A and b∈B. We have:
[TABLE]
To show that the witnesses sum to at most 1, we have
[TABLE]
Moreover, these witness distributions satisfy the claimed distance condition.
Indeed for (a,b)∈A×B, we have:
[TABLE]
It remains to prove that η\mspace−1.5mu⊲, η\mspace−1.5mu⊳ are witnesses
for μ1Rε,δ(⋆)μ2. The marginals
conditions are obvious. For the support condition, let a∈A and
b∈B. Then, η\mspace−1.5mu⊲(a,b)>0 (resp.
η\mspace−1.5mu⊳(a,b)>0) implies μ\mspace−1.5mu⊲(a,b)>0 (resp.
η\mspace−1.5mu⊳(a,b)>0) and hence that aRb.
The distance condition follows by a calculation. For X⊆A⋆×B⋆, we have:
[TABLE]
To conclude, it suffices to show that
η\mspace−1.5mu⊲[X∩(A×{⋆})]≤δ.
For that, let
[TABLE]
Then, we can bound
[TABLE]
Now, let
S=△{(a,b)∈A⋆×B⋆∣eε⋅μ\mspace−1.5mu⊳(a,b)<μ\mspace−1.5mu⊲(a,b)}.
By the distance condition on μ\mspace−1.5mu⊲ and μ\mspace−1.5mu⊳, we have
μ\mspace−1.5mu⊲[S]−eε⋅μ\mspace−1.5mu⊳[S]≤δ.
Hence we conclude the desired distance condition:
[TABLE]
We can now prove the reverse direction of Theorem 9.
By Lemma 10, without loss of generality, we can
assume that A and B are countable. Hence, there exists a family
{An}n (resp. {Bn}n) of increasing finite subsets of
A s.t. ∪iAi=A (resp. of B s.t. ∪iBi=B). For
n∈N, we denote by μ1n and μ2n the domain
restrictions of μ1 to An and μ2 to Bn,
i.e., μ1n(a)=μ1(a) if a∈An, [math] otherwise and
μ2n(b)=μ2(b) if b∈Bn, [math] otherwise.
Fix n∈N and let X⊆A. We have:
[TABLE]
Hence, by Lemma 11, we have
μ1nRε,δn(⋆)μ2n. By
Lemma 12, we can moreover assume that
μ1nRε,δn(⋆)μ2n is witnessed by
sub-distributions η\mspace−1.5mu⊲n and η\mspace−1.5mu⊳n such that
[TABLE]
Since R(X)∩Bnn→∞∅, we
also have δnn→∞δ.
As a countable product of a sequentially compact sets,
[0,1]A×B⋆ and [0,1]A⋆×B are
sequentially compact, and we can find a subsequence of indices
{ωn}n∈N s.t. both η\mspace−1.5mu⊲ωn,
η\mspace−1.5mu⊳ωn resp. converge pointwise to
sub-distributions η\mspace−1.5mu⊲ and η\mspace−1.5mu⊳.
We now prove that these sub-distributions are witnesses for
μ1Rε,δ(⋆)μ2. It is clear that the
supports of η\mspace−1.5mu⊲∣A×B and
η\mspace−1.5mu⊳∣A×B are included in R. We now
detail the marginals and distance conditions.
First, note that
From (3a), it is clear that
b∈B⋆↦η\mspace−1.5mu⊲ωn(a,b) is absolutely
dominated by the summable function
[b\in B^{\star}\mapsto\text{e^{\varepsilon}\cdot\mu_{2}(b)ifb\in B,else1}].
By the dominated convergence theorem, we can swap the limit and summation:
[TABLE]
For the second marginal, let b∈B. We have:
[TABLE]
From (3b), we have that
a∈A⋆↦η\mspace−1.5mu⊲ωn(a,b) is absolutely
dominated by the summable function:
[a\in A^{\star}\mapsto\text{\mu_{1}(a)ifa\in A,else1}].
Again by the dominated convergence theorem, we can swap the limit and
summation:
[TABLE]
We are left to prove the distance condition, i.e., that for any
X⊆A∗×B∗, we have
η\mspace−1.5mu⊲[X]−eε⋅η\mspace−1.5mu⊳[X]≤δ.
First, note that
η\mspace−1.5mu⊲[X]=limn→∞η\mspace−1.5mu⊲ωn[X]
and
η\mspace−1.5mu⊳[X]=limn→∞η\mspace−1.5mu⊳ωn[X].
Indeed, for η\mspace−1.5mu⊲[X], we have
[TABLE]
where X\mspace−1.5mu⊲a=△{b∈B⋆∣(a,b)∈X}.
The first application of the dominated convergence theorem
uses, like for the first marginal condition, the function
[b\in B^{\star}\mapsto\text{e^{\varepsilon}\cdot\mu_{2}(b)ifb\in B,else1}]
as the dominating function (using Equation (3a)). The
second application of the dominated convergence theorem uses
[a\in A^{\star}\mapsto\text{\mu_{1}(a)ifa\in A,else0}]
as the dominating function. Indeed, if a∈A, then
[TABLE]
Likewise, for η\mspace−1.5mu⊳[X], we have
[TABLE]
where X\mspace−1.5mu⊳b=△{a∈A⋆∣(a,b)∈X}.
Here, the first application of the dominated convergence theorem
uses, as for the second marginal condition, the function
[a\in A^{\star}\mapsto\text{\mu_{1}(a)ifa\in A,else1}]
as the dominating function (using equation (3b)). The
second application of the dominated convergence theorem uses
[b\in B^{\star}\mapsto\text{\mu_{2}(b)ifb\in B,else0}]
as the dominating function. Indeed, if b∈B, then
[TABLE]
Hence, we can conclude the distance condition by taking limits:
[TABLE]
This proof constructs witnesses to an approximate lifting relating two
distributions (μ1,μ2) given a sequence of approximate liftings relating
pairs of finite restrictions of μ1 and μ2. Using essentially the same
argument, we can construct an approximate lifting relating (μ1,μ2)
given a sequence of approximate liftings relating pairs of distributions
converging to μ1 and μ2; the main difference is that a slightly more
general form of the dominated convergence theorem is needed (see Hsu [2017, Lemma
5.1.7] for details).
4. Properties of ⋆-Liftings
Our main theorem can be used to show several natural properties of
⋆-liftings. To begin, we can improve the mapping property from
Theorem 2, lifting the requirement that the maps must be
surjective.
Lemma 13**.**
Let μ1∈D(A1), μ2∈D(A2),
f1:A1→B1, f2:A2→B2 and R a binary relation
on B1 and B2. Let S such that a1Sa2⟺△f1(a1)Rf2(a2). Then
Assume that
f1♯(μ1)Rε,δ(⋆)f2♯(μ2)
and let X⊆A1. Then,
[TABLE]
Hence, by Theorem 9,
μ1Sε,δ(⋆)μ2.
**(⟸): **
Assume that
μ1Sε,δ(⋆)μ2
and let X⊆A2. Then,
[TABLE]
Hence, by Theorem 9,
f1♯(μ1)Rε,δ(⋆)f2♯(μ2).
Similarly, we can generalize the existing rules for up-to-bad reasoning (cf. Barthe et al. [2016a, Theorem 13]), which restrict the post-condition to be
equality. There are two versions: the conditional event is either on the left
side, or the right side. Note that the resulting indices δ are
different in the two cases. We write θ for the complement of
θ.
Lemma 14**.**
Let μ1∈D(A), μ2∈D(B), θ⊆A
and R⊆A×B. Assume that
μ1(θ\mspace−1.5mu⊲⟹R)ε,δ(⋆)μ2
for some parameters ε,δ≥0.
Then μ1Rε,δ(⋆)μ2,
where δ=△δ+μ1[θ].
Let μ1∈D(A), μ2∈D(B), θ⊆B
and R⊆A×B. Assume that
μ1(θ\mspace−1.5mu⊳⟹R)ε,δ(⋆)μ2
for some parameters ε,δ≥0.
Then, μ1Rε,δ(⋆)μ2,
where δ=△δ+eε⋅μ2[θ].
As a consequence, an approximately lifted relation can be conjuncted with a
one-sided predicate if the δ parameter is increased. This principle is
useful for constructing approximate liftings based on accuracy bounds.
For instance, suppose that we have an approximate lifting of R relating two
distributions, μ\mspace−1.5mu⊲ and μ\mspace−1.5mu⊳. If θa is an event that
happens with high probability in the first distribution μ\mspace−1.5mu⊲—say, a
certain noise variable is at most 100—we can incorporate θa,⊲ into the approximate lifting by increasing the δ parameter by
the probability that θafails to hold in μ\mspace−1.5mu⊲. When
reasoning in terms of approximate couplings, intuitively we can “assume”
θa holds by “paying” with an increase in δ. A similar
property holds for the second distribution μ\mspace−1.5mu⊳.
We formalize these constructions with the following lemma.
Lemma 16**.**
Let μ1∈D(A), μ2∈D(B),
θa⊆A, θb⊆B and
R⊆A×B.
Assume that
μ1Rε,δ(⋆)μ2.
Then,
μ1(θa,⊲∩R)ε,δa(⋆)μ2
and
μ1(θb,⊳∩R)ε,δb(⋆)μ2
where δa=△δ+μ1[θa] and
δb=△δ+eε⋅μ2[θb].
From
μ1Rε,δ(⋆)μ2
and Lemma 6, we have
μ1Sε,δ(⋆)μ2
where
S=△θa,⊲⟹θa,⊲∩R.
Hence, by Lemma 14, we obtain
μ1(θa,⊲∩R)ε,δa(⋆)μ2.
Using similar reasoning with
θb,⊳⟹θb,⊳∩R
and Lemma 15, we have
μ1(θb,⊳∩R)ε,δb(⋆)μ2.
⋆-liftings also support a significant generalization of optimal subset
coupling. Unlike the known construction for 2-liftings
(Theorem 3), the two subsets need not be nested, and either
subset may be the entire domain. Furthermore, the distributions μ1,μ2
need not be the same, or even have the same domain. Finally, the equivalence is
valid for any parameters (ε,δ), not just δ=0.
Theorem 17**.**
Let μ1∈D(A1), μ2∈D(A2) and consider two subsets P1⊆A1,P2⊆A2. Then, we have the following equivalence:
is automatic since P2⊆P1 implies Pμ[A−P1]≤Pμ[A−P2]. Note that there is no longer a need for P1 to
be a strict subset of A.
Finally, we can directly extend known composition theorems from differential
privacy to ⋆-liftings. This connection is quite useful for transferring
existing composition results from the privacy literature to approximate
liftings. We first define a general template describing how the privacy
parameters ε,δ decay under sequential composition.
{defi}
Let R2≥0=△R≥0×R≥0 and let
(R2≥0)∗ be the set of finite sequences over pairs of non-negative
reals. A map r:(R2≥0)∗→R2≥0 is a
DP-composition rule if for all sets A,D, adjacency relations ϕ⊆D×D, and families of functions {fi:D×A→D(A)}i<n, the following implication holds: if for every initial
value a∈A and i<n, fi(−,a):D→D(A) is (εi,δi)-differentially private w.r.t. ϕ, then F(−,a) is
(ε∗,δ∗)-differentially private w.r.t. ϕ and any initial
value a∈A where F:(d,a)↦(◯i<n(fi(d,−))♯)(1a) is the n-fold composition of the functions
[fi]i<n and (ε∗,δ∗)=△r([(εi,δi)]i<n).
Lemma 19**.**
Let r:(R2≥0)∗→R2≥0 be a DP-composition rule.
Let n∈N and assume given two families of sets
{Ai}i≤n and {Bi}i≤n, together with a
family of binary relations
{R(i)⊆Ai×Bi}i≤n.
Let
{gi:Ai→D(Ai+1)}i<n and
{hi:Bi→D(Bi+1)}i<n be two families of functions s.t.
for all i<n and (a,b)∈R(i), we have:
(1)
gi(a)R(i+1)εi,δi(⋆)hi(b)* for some parameters εi,δi≥0, and*
2. (2)
gi(a)* and hi(b) are proper distributions.*
Then for (a0,b0)∈R(0), there exists a ⋆-lifting
[TABLE]
where G:A0→D(An) and H:B0→D(Bn) are the n-fold
compositions of [gi]i≤n and [hi]i≤n
respectively—i.e. G(a)=△(◯i<ngi♯)(1a) and
H(b)=△(◯i<nhi♯)(1b)—and (ε∗,δ∗)=△r([(εi,δi)]i<n).
Proof 4.11**.**
We assume that Ai=A is the same for all i, and Bi=B is the same
for all i. This is without loss of generality, since when Ai and Bi
vary with i we may work with the disjoint unions ⊔iAi and
⊔iBi by restricting each R(i) to only relate pairs that are both
in the i-th component. Define D={0,1}={tt,ff} and
ϕ={(tt,ff)}⊆D×D.
For every i<n and (ai,bi)∈R(i), the definition of
⋆-lifting gives two distributions μ\mspace−1.5mu⊲[ai,bi],μ\mspace−1.5mu⊳[ai,bi] witnessing gi(a)R(i+1)εi,δi(⋆). We regard both witness distributions as elements of
D(A⋆×B⋆) via the evident embeddings. We define the maps
fi:D×(A⋆×B⋆)→D(A⋆×B⋆) by
cases:
[TABLE]
where × denotes the product distribution and (ai,bi)∈R(i);
otherwise, fi(−,(a,b))=0.
Now for all (a,b)∈A⋆×B⋆, the map fi(−,(a,b)):D→D(A⋆×B⋆) is (εi,δi)-differentially
private with respect to ϕ by the distance property on μ\mspace−1.5mu⊲[ai,bi] and μ\mspace−1.5mu⊳[ai,bi] (and by definition when (ai,bi)∈/R(i)). Hence, the DP-composition rule implies that F:(d,(a,b))↦(◯i<n(fi(d,−))♯)(1(a,b)) is (ε∗,δ∗)-differentially private with respect to ϕ for any (a,b)∈A⋆×B⋆. For any (a0,b0)∈R(0), we claim that
that F(tt,(a0,b0)) and F(ff,(a0,b0)) witness the
desired approximate lifting
[TABLE]
The support and marginal conditions are not hard to show, and the distance
condition follows from differential privacy of F(−,(a0,b0)):D→D(A⋆×B⋆).
Some of the more sophisticated composition results from differential
privacy—for instance, the advanced composition theorem by Dwork et al. [2010]—do
not apply to arbitrary adjacency relations ϕ, but only symmetric
relations. Lemma 19 cannot lift such theorems to composition
principles for approximate liftings. In Section 6 we will remedy this
problem by working with a symmetric version of ⋆-lifting.
5. Comparison with Prior Approximate Liftings
Now that we have seen ⋆-liftings, we briefly consider other definitions of
approximate liftings. We have already seen 2-liftings, which involve two
witnesses (Definition 1). Evidently, ⋆-liftings strictly
generalize 2-liftings.
Theorem 20**.**
For all binary relations R over A and B and parameters
ε,δ≥0, we have
Rε,δ(2)⊆Rε,δ(⋆).
There exist relations and parameters where the inclusion is strict.
Proof 5.1**.**
The inclusion Rε,δ(2)⊆Rε,δ(⋆) is immediate. We have a strict inclusion
Rε,δ(2)⊊Rε,δ(⋆) even for δ=0 by considering the optimal subset coupling
from Theorem 3. Consider a distribution μ over set A,
and let P1⊆P2=A. There is an (ε,0)-approximate
⋆-lifting (by Theorem 17), but a (ε,0)-approximate 2-lifting does not exist if μ has non-zero mass outside
of P1: the first witness μ\mspace−1.5mu⊲ must place non-zero mass at
(a1,a2) with a1∈/P1 in order to have π1♯(μ\mspace−1.5mu⊲)=μ,
but we must have a2∈/P2 for the support requirement, and there is no
such a2.
We can also compare ⋆-liftings with the original definitions of
(ε,δ)-approximate lifting by Barthe et al. [2013]. They introduce
two notions, a symmetric lifting and an asymmetric lifting, each using a single
witness distribution. We will focus on the asymmetric version here, and return
to the symmetric version in Section 6.
{defi}
[Barthe et al. [2013]]
Let μ1∈D(A) and μ2∈D(B) be sub-distributions,
ε,δ∈R≥0 and R be a binary relation over A and B. An (ε,δ)-approximate1-lifting
of μ1 and μ2 for R is a sub-distribution μ∈D(A×B) s.t.
(1)
π1♯(μ)≤μ1 and π2♯(μ)≤μ2;
2. (2)
Δε(μ1,π1♯(μ))≤δ; and
3. (3)
supp(μ)⊆R.
In the first point we take the point-wise order on sub-distributions: if μ
and μ′ are sub-distributions over X, then μ≤μ′ when μ(x)≤μ′(x) for all x∈X. We will write
μ1Rε,δ(1)μ2
if there exists an (ε,δ)-approximate 1-lifting of μ1 and μ2 for R; the superscript ⋅(1) indicates that
there is one witness for this lifting.
1-liftings bear a close resemblance to probabilistic couplings from
probability theory, which also have a single witness. However, 1-liftings are
more awkward to manipulate and less well-understood theoretically than their
2-lifting cousins—basic properties such as mapping (Lemma 13)
are not known to hold; the subset coupling (Theorem 3) is not
known to exist. Somewhat surprisingly, 1-liftings are equivalent to
⋆-liftings and hence by Theorem 9, also to Sato’s
approximate lifting.
Theorem 21**.**
For all binary relations R over A and B and parameters
ε,δ≥0, we have
Rε,δ(1)=Rε,δ(⋆).
Suppose that (μL,μR) are witnesses to μ1Rε,δ(⋆)μ2. Define the witness η∈D(A×B) as the
point-wise minimum: η(a,b)=min(μL(a,b),μR(a,b)). We
will show that η is a witness to μ1Rε,δ(1)μ2.
The support condition follows from the support condition for (μL,μR).
The marginal conditions π1♯(η)≤μ1 and π2♯(η)≤μ2
also follow by the marginal conditions for (μL,μR). The only thing to
check is the distance condition. By the distance condition on (μL,μR), there exist non-negative values δ(a,b) such that
[TABLE]
and ∑a,bδ(a,b)≤δ. So, μR(a,b)≥exp(−ε)(μL(a,b)−δ(a,b)). Now let S⊆A be
any subset. Then:
[TABLE]
Thus, η witnesses μ1Rε,δ(1)μ2, so
Rε,δ(⋆)⊆Rε,δ(1).
The other direction is more interesting. Let η∈D(A×B) be
the witness for Rε,δ(1). By the distance
condition Δε(μ1,π1♯η)≤δ, there exist
non-negative values δ(a) such that
[TABLE]
with equality when δ(a) is strictly positive, and ∑a∈Aδ(a)≤δ. Define two witnesses μL∈D(A×B⋆),μR∈D(A⋆×B) as follows:
[TABLE]
(As usual, if any denominator is zero, we take the probability to be zero as
well.)
The support condition follows from the support condition of η. The
marginal conditions hold by definition. Note that all probabilities are
non-negative. For μL, note that if δ(a)>0 then μ1(a)−δ(a)=exp(ε)π1♯η(a)≥0 and hence
[TABLE]
assuming π1♯η(a)>0; if π1♯η(a)=0 then μL(a,⋆)=0.
For μR, non-negativity holds because π2♯η≤μ2.
We show the distance bound. Note that when a,b=⋆, by definition
μL(a,b) and μR(a,b) are both strictly positive or both equal to
zero, and η(a,b) is strictly positive or equal to zero accordingly. If
μL(a,b),μR(a,b),η(a,b) are all strictly positive, then we know
[TABLE]
Thus we always have
[TABLE]
We can also bound the mass on points (a,⋆). Let S⊆A be
any subset. Then:
[TABLE]
So Δε(μL,μR)≤δ as
desired, and we have witnesses to μ1Rε,δ(⋆)μ2. Hence, Rε,δ(1)⊆Rε,δ(⋆).
6. Symmetric ⋆-Lifting
The approximate liftings we have considered so far are all asymmetric.
For instance, the approximate lifting μ1Rε,δ(⋆)μ2 may not imply the lifting μ2(R−1)ε,δ(⋆)μ1. Given witnesses (μL,μR) to the first lifting, we may consider
the witnesses (νL,νR)=(μR⊤,μL⊤) where the
transpose map (−)⊤:D(A×B)→D(B×A) is defined in
the obvious way. Then (νL,νR) almost witness the second lifting—the
marginal and support conditions holds, but the distance bound is in the wrong
direction:
[TABLE]
In general, we cannot bound Δε(νL,νR) and the symmetric
lifting μ2(R−1)ε,δ(⋆)μ1 may not hold.
To recover symmetry, we can define a symmetric version of ⋆-lifting.
{defi}[Symmetric ⋆-lifting]
Let μ1∈D(A) and μ2∈D(B) be sub-distributions,
ε,δ∈R≥0 and R be a binary relation over A and B. An (ε,δ)-approximate symmetric⋆-lifting of μ1 and μ2 for R is a pair of
sub-distributions
η\mspace−1.5mu⊲∈D(A×B⋆) and
η\mspace−1.5mu⊳∈D(A⋆×B) s.t.
(1)
π1♯(η\mspace−1.5mu⊲)=μ1 and π2♯(η\mspace−1.5mu⊳)=μ2;
2. (2)
supp(η\mspace−1.5mu⊲∣A×B),supp(η\mspace−1.5mu⊳∣A×B)⊆R; and
3. (3)
Δε(η\mspace−1.5mu⊲,η\mspace−1.5mu⊳)≤δ,Δε(η\mspace−1.5mu⊳,η\mspace−1.5mu⊲)≤δ,
where η∙ is the canonical lifting of
η∙ to A⋆×B⋆.
We write
μ1Rε,δ(⋆)μ2
if there exists an (ε,δ)-approximate symmetric lifting of
μ1 and μ2 for R.
Symmetric ⋆-lifting is a special case of ⋆-lifting that can capture
differential privacy under when the adjacency relation ϕ is
symmetric: a probabilistic computation M:A→D(B) is
(ε,δ)-differentially private if and only if for every two
adjacent inputs aϕa′, there is an approximate lifting of the
equality relation: M(a)(=)ε,δ(⋆)M(a′).
Unfortunately, the more advanced properties in Section 4 do not all
hold when moving to symmetric liftings. However, we can show that symmetric
⋆-liftings are equivalent to the symmetric version of 1-witness lifting
proposed by Barthe et al. [2013].
{defi}
[Barthe et al. [2013]]
Let μ1∈D(A) and μ2∈D(B) be sub-distributions,
ε,δ∈R≥0 and R be a binary relation over A and B. An (ε,δ)-approximate symmetric1-lifting of μ1 and μ2 for R is a
sub-distribution μ∈D(A×B) s.t.
(1)
π1♯(μ)≤μ1 and π2♯(μ)≤μ2;
2. (2)
Δε(μ1,π1♯(μ))≤δ and
Δε(μ2,π2♯(μ))≤δ; and
3. (3)
supp(μ)⊆R.
We will write
μ1Rε,δ(1)μ2
if there exists an (ε,δ)-approximate symmetric 1-lifting of
μ1 and μ2 for R; the superscript ⋅(1) indicates
that there is one witness for this lifting.
Theorem 22** (cf. the asymmetric result Theorem 21).**
For all binary relations R over A and B and parameters
ε,δ≥0, we have
Rε,δ(1)=Rε,δ(⋆).
Suppose that (μL,μR) are witnesses to μ1Rε,δ(⋆)μ2. Define the witness η∈D(A×B) as the point-wise minimum: η(a,b)=min(μL(a,b),μR(a,b)). We will show that η is a witness to
μ1Rε,δ(1)μ2.
The support condition follows from the support condition for (μL,μR).
The marginal conditions π1♯(η)≤μ1 and π2♯(η)≤μ2
also follow by the marginal conditions for (μL,μR). The only thing to
check is the distance condition. By the distance condition on (μL,μR), there exist non-negative values δ(a,b) such that
[TABLE]
and ∑a,bδ(a,b)≤δ. So, μR(a,b)≥exp(−ε)(μL(a,b)−δ(a,b)). Similarly, there are
non-negative values δ′(a,b) such that
[TABLE]
and ∑a,bδ′(a,b)≤δ. So, μL(a,b)≥exp(−ε)(μR(a,b)−δ′(a,b)).
Now let S⊆A be any subset. Then:
[TABLE]
The other marginal is similar. For any subset T⊆B we have
[TABLE]
Thus, η witnesses μ1Rε,δ(1)μ2.
The other direction is more interesting. Let η∈D(A×B) be
the single witness to Rε,δ(1). By the
distance conditions Δε(μ1,π1♯η)≤δ and
Δε(μ2,π2♯η)≤δ, there exist non-negative
values δ(a) and δ′(b) such that
[TABLE]
there is equality when δ(a) or δ′(b) are strictly positive, and
both ∑a∈Aδ(a) and ∑b∈Bδ′(b) are at
most δ. Define two witnesses μL∈D(A×B⋆),μR∈D(A⋆×B) as follows:
[TABLE]
(As usual, if any denominator is zero, we take the probability to be zero as
well.)
The support condition follows from the support condition of η. The
marginal conditions hold by definition. Note that all probabilities are
non-negative. For instance in μL, note that if δ(a)>0 then
μ1(a)−δ(a)=exp(ε)π1♯η(a)≥0 and hence
[TABLE]
assuming π1♯η(a)>0; if π1♯η(a)=0 then μL(a,⋆)=0.
A similar argument shows that μR is non-negative.
So, it remains to check the distance bounds. Note that when a,b=⋆, by definition μL(a,b) and μR(a,b) are both strictly
positive or both equal to zero, and η(a,b) is strictly positive or equal
to zero accordingly. If μL(a,b),μR(a,b),η(a,b) are all
strictly positive, then we know
[TABLE]
We can also lower bound the ratios:
[TABLE]
for instance when δ(a)>0 then the ratio is exactly equal to
exp(ε)≥1, and when δ(a)=0 then the ratio is at least
1 by the marginal property π1♯η≤μ1.
So we have μL(a,b)/η(a,b) and μR(a,b)/η(a,b) in [1,exp(ε)] when all distributions are strictly positive. Thus we always
have
[TABLE]
We can also bound the mass on points (a,⋆). Let S⊆A be
any subset. μR(S×{⋆})≤exp(ε)μL(S×{⋆})+δ is clear. For the other
direction:
[TABLE]
The mass at points (⋆,b) can be bounded in a similar way. Let T⊆B be any subset. Then, μL({⋆}×T)≤exp(ε)μR({⋆}×T)+δ is clear. For
the other direction:
[TABLE]
So Δε(μL,μR)≤δ and
Δε(μR,μL)≤δ so we have
witnesses to μ1Rε,δ(⋆)μ2. Hence,
Rε,δ(1)=Rε,δ(⋆).
The main use of symmetric approximate liftings is to support richer composition
results that only apply to symmetric adjacency relations.
{defi}
Let R2≥0=△R≥0×R≥0 and let
(R2≥0)∗ be the set of finite sequences over pairs of non-negative
reals. A map r:(R2≥0)∗→R2≥0 is a
symmetric DP-composition rule if for all sets A,D, symmetric adjacency relations ϕ⊆D×D, and families of functions {fi:D×A→D(A)}i<n, the following implication holds: if for every initial
value a∈A and i<n, fi(−,a):D→D(A) is (εi,δi)-differentially private w.r.t. ϕ, then F(−,a) is
(ε∗,δ∗)-differentially private w.r.t. ϕ and any initial
value a∈A where F:(d,a)↦(◯i<n(fi(d,−))♯)(1a) is the n-fold composition of the functions
[fi]i<n and (ε∗,δ∗)=△r([(εi,δi)]i<n).
We have the following reduction, a symmetric version of Lemma 19.
Lemma 23**.**
Let r:(R2≥0)∗→R2≥0 be a symmetric DP-composition
rule. Let n∈N and assume given two families of sets {Ai}i≤n and {Bi}i≤n, together with a family of binary relations
{R(i)⊆Ai×Bi}i≤n.
Fix two families of functions
{gi:Ai→D(Ai+1)}i<n and
{hi:Bi→D(Bi+1)}i<n s.t.
for any i<n and (a,b)∈R(i) we have:
(1)
gi(a)R(i+1)εi,δi(⋆)hi(b)* for some parameters εi,δi≥0, and*
2. (2)
gi(a)* and hi(b) are proper distributions.*
Then for (a0,b0)∈R(0), there exists a symmetric ⋆-lifting
[TABLE]
where G:A0→D(An) and H:B0→D(Bn) are the n-fold
compositions of [gi]i≤n and [hi]i≤n
respectively—i.e. G(a)=△(◯i<ngi♯)(1a) and
H(b)=△(◯i<nhi♯)(1b) and (ε∗,δ∗)=△r([(εi,δi)]i<n).
Proof 6.3**.**
Essentially the same as the proof of Lemma 19. Let D={tt,ff} as before, and take ϕ={(tt,ff),(ff,tt)} be a binary
relation on D.
For every i<n and (ai,bi)∈R(i), the definition of symmetric
⋆-lifting gives two distributions μ\mspace−1.5mu⊲[ai,bi],μ\mspace−1.5mu⊳[ai,bi] witnessing gi(a)R(i+1)εi,δi(⋆). We define the same maps fi:D×(A⋆×B⋆)→D(A⋆×B⋆) as before:
[TABLE]
for (ai,bi)∈R(i); otherwise, fi(−,(a,b))=0.
Compared to proof of Lemma 19, the crucial difference is that
since we have witnesses to a symmetric approximate lifting, the
resulting maps fi(−,(a,b)):D→D(A⋆×B⋆) are
(εi,δi)-differentially private with respect to the
symmetric relation ϕ, not just the asymmetric
relation ϕ. Hence, we may apply the symmetric DP-composition rule r
and conclude as before.
With this reduction we hand, we can generalize the advanced composition
theorem from differential privacy to ⋆-liftings.
Theorem 24** (Advanced composition [Dwork et al., 2010]).**
Consider a symmetric adjacency relation ϕ on databases D. Let
fi:D×A→D(A) be a sequence of n functions, such that for
every a∈A the functions fi(−,a):D→D(A) are (ε,δ)-differentially private with respect to ϕ. Then, for every a∈A and ω∈(0,1), running f1,…,fn in sequence is
(ε∗,δ∗)-differentially private for
[TABLE]
Corollary 25**.**
Let n be a natural number, ε,δ≥0, and ω∈(0,1) be real parameters. Suppose we have:
(1)
sets {Ai}i,{Bi}i with i ranging from 0,…,n;
2. (2)
relations {R(i)}i on Ai and Bi with i ranging
from 0,…,n; and
3. (3)
functions {fi:Ai→D(Ai+1)}i,{gi:Bi→D(Bi+1)}i with i ranging from 0,…,n−1
such that for all (a,b)∈R(i), we have
[TABLE]
and fi(a),gi(b) proper distributions. Then, there is an approximate
lifting of the compositions:
[TABLE]
for every (a0,b0)∈R(0), where F:A0→D(An) and G:B0→D(Bn) are the n-fold (Kleisli) compositions of {fi} and {gi} respectively, and the lifting parameters are:
[TABLE]
Proof 6.4**.**
By the advanced composition theorem for differential privacy
(Theorem 24), the map r([(ε,δ)]i<n)=△(ε′,δ′) is a symmetric DP composition rule. So, we can
conclude by Lemma 23.
7. ⋆-Lifting for f-Divergences
The definition of ⋆-lifting can be extended to lifting
constructions based on general f-divergences, as previously proposed
by Barthe and Olmedo [2013], Olmedo [2014]. Roughly, a f-divergence is a function Δf(μ1,μ2) that measures the difference between two probability
distributions μ1 and μ2. Much like we generalized the
definition for (ε,δ)-liftings, we can define
⋆-lifting with f-divergences. Let us first
formally define f-divergences.
We denote by F the set of non-negative convex
functions vanishing at 1: F={f:R≥0→R≥0∣f(1)=0}. We also adopt the following notational
conventions: 0⋅f(0/0)=△0, and 0⋅f(x/0)=△x⋅limt→0+t⋅f(1/t); we write Lf for the limit.
{defi}
Given f∈F , the f-divergenceΔf(μ1,μ2)
between two distributions μ1 and μ2 in D(A) is defined as
[TABLE]
Examples of f-divergences include statistical distance (f(t)=21∣t−1∣), Kullback-Leibler divergence (f(t)=tln(t)−t+1),333The additional term −t+1 extends the classical definition of
KL-divergence to sub-distributions [Barthe and Olmedo, 2013].
and Hellinger distance (f(t)=21(t−1)2).
{defi}
[⋆-lifting for f-divergences]
Let μ1∈D(A) and μ2∈D(B) be distributions,
R be a binary relation over
A and B, and f∈F.
An (f;δ)-approximate lifting of
μ1 and μ2 for R is a pair of distributions
η\mspace−1.5mu⊲∈D(A×B⋆) and
η\mspace−1.5mu⊳∈D(A⋆×B) s.t.
•
π1♯(η\mspace−1.5mu⊲)=μ1 and π2♯(η\mspace−1.5mu⊳)=μ2;
•
supp(η\mspace−1.5mu⊲∣A×B),supp(η\mspace−1.5mu⊳∣A×B)⊆R; and
•
Δf(η\mspace−1.5mu⊲,η\mspace−1.5mu⊳)≤δ,
where η∙ is the canonical lifting of η∙
to A⋆×B⋆.
We will write:
μ1Rf;δ(⋆)μ2
if there exists an (f;δ)-approximate lifting of
μ1 and μ2 for R.
⋆-liftings for certain f-divergences compose sequentially.
Lemma 26**.**
Suppose f corresponds to statistical distance, Kullback-Leibler, or
Hellinger distance. For i∈{1,2}, let μi∈D(Ai) and
ηi:Ai→D(Bi). Let R (resp. S) be a binary relation
over A1 and A2 (resp. over B1 and B2). If
μ1Rf;δ(⋆)μ2
for some δ≥0 and for any (a1,a2)∈R we have
η1(a1)Sf;δ′(⋆)η2(a2)
for some δ′≥0,
then
[TABLE]
Proof 7.1**.**
Essentially the same as the proof of Lemma 19, lifting known
composition results for these f-divergences (namely, Barthe and Olmedo [2013, Proposition
5]).
Much like the ⋆-liftings we saw before, ⋆-liftings for
f-divergences have witness distributions with support determined by the
support of μ1 and μ2 (cf. Lemma 4).
Lemma 27**.**
Let μ1∈D(A) and μ2∈D(B) be distributions such that
μ1Rf;δ(⋆)μ2 .
Then, there are witnesses with support contained in supp(μ1)⋆×supp(μ2)⋆.
Let μ\mspace−1.5mu⊲ and μ\mspace−1.5mu⊳ be any pair of witnesses to the approximate
lifting. We will construct witnesses η\mspace−1.5mu⊲,η\mspace−1.5mu⊳ with the
desired support. For ease of notation, let Si=△supp(μi) for
i∈{1,2}. Define:
[TABLE]
Evidently, η\mspace−1.5mu⊲ and η\mspace−1.5mu⊳ have support in S1⋆×S2⋆. Additionally, it is straightforward to check that
π1♯(η\mspace−1.5mu⊲)=π1♯(μ\mspace−1.5mu⊲)=μ1 and π2♯(η\mspace−1.5mu⊳)=π2♯(μ\mspace−1.5mu⊳)=μ2 so η\mspace−1.5mu⊲ and η\mspace−1.5mu⊳ have the desired
marginals.
It only remains to check the distance condition. We can compute:
[TABLE]
Now, note that for all b′∈B⋆−S2, we know μ\mspace−1.5mu⊳(a,b′)=0. Similarly, for all a′∈A⋆−S1, we know
μ\mspace−1.5mu⊲(a′,b)=0. Hence, the last line is equal to
[TABLE]
Thus, η\mspace−1.5mu⊲ and η\mspace−1.5mu⊳ witness the desired ⋆-lifting.
Finally, the mapping property from Lemma 13 holds also for these
⋆-liftings. While the proof of Lemma 13 relies on the
equivalence for Sato’s definition, there is no such equivalence (or definition)
for general f-divergences. Therefore, we must work directly with the
witnesses of the approximate lifting.
Lemma 28**.**
Let μ1∈D(A1), μ2∈D(A2),
g1:A1→B1, g2:A2→B2 and R a binary relation
on B1 and B2. Let S such that a1Sa2⟺△g1(a1)Rg2(a2). Then
For the reverse direction, take the witnesses μ\mspace−1.5mu⊲,μ\mspace−1.5mu⊳∈D(A⋆×A⋆) and define witnesses ν\mspace−1.5mu⊲=△(g1⋆×g2⋆)♯(μ\mspace−1.5mu⊲) and ν\mspace−1.5mu⊳=△(g1⋆×g2⋆)♯(μ\mspace−1.5mu⊳), where g1⋆×g2⋆ takes a pair (a1,a2) to the pair (g1(a1),g2(a2)) and maps ⋆ to ⋆. The
support and marginal requirements are clear. The only thing to check is the
distance condition, but this follows from monotonicity of
f-divergences—under the mapping g1⋆×g2⋆:A⋆×A⋆→B⋆×B⋆, the f-divergence can only
decrease (see, e.g., Csiszár and Shields [2004]).
For the forward direction, let ν\mspace−1.5mu⊲,ν\mspace−1.5mu⊳∈D(B⋆×B⋆) be the witnesses to the second lifting. By Lemma 27, we
may assume without loss of generality that supp(ν\mspace−1.5mu⊲) and
supp(ν\mspace−1.5mu⊳)
are contained in
[TABLE]
We aim to construct a pair
of witnesses μ\mspace−1.5mu⊲,μ\mspace−1.5mu⊳∈D(A⋆×A⋆) to the first
lifting. The basic idea is to define μ\mspace−1.5mu⊲ and μ\mspace−1.5mu⊳ based on equivalence
classes of elements in A mapping to a particular b∈B, and then smooth
out the probabilities within each equivalence class.
To begin, for a∈A, define
[a]g=△g−1(g(a)) and αi(a)=△Prμi[{a}∣[a]gi].
We take αi(a)=0 when μi([a]fi)=0, and we let
αi(⋆)=0. We define μ\mspace−1.5mu⊲ and μ\mspace−1.5mu⊳ as
[TABLE]
where
[TABLE]
The support and marginal conditions follow from the support and marginal
conditions of ν\mspace−1.5mu⊲, ν\mspace−1.5mu⊳. For instance:
[TABLE]
In the last line, we replace the sum over b2∈g2(A⋆) with a sum
over b2∈B⋆; this holds since the support of g2♯(μ2) is
contained in g2(A), so we can assume that ν\mspace−1.5mu⊲(a,b2)=0 for all b2
outside of g2(A⋆). Then, we can conclude by the marginal condition
π1♯(ν\mspace−1.5mu⊲)=g1♯(μ1). The second marginal is similar.
We now check the distance condition
Δf(μ\mspace−1.5mu⊲,μ\mspace−1.5mu⊳)≤δ.
We can split the f-divergence into Δf(μ\mspace−1.5mu⊲,μ\mspace−1.5mu⊳)=P0+P1+P2+P3, where
[TABLE]
We will handle each term separately. Evidently P0=0. For P1, we have
[TABLE]
where the sets S=0 and S=0 are
[TABLE]
By further rearranging,
[TABLE]
The final equality is because without loss of generality, we can assume (by
Lemma 27) that ν\mspace−1.5mu⊲,ν\mspace−1.5mu⊳ are zero outside of the
support of g1♯(μ1) and g2♯(μ2), which have support
contained in (g1×g2)(A×A).
The remaining two terms P2 and P3 are simpler to bound. For P2, note
that μ\mspace−1.5mu⊳(a,⋆)=0 for all a∈A. Thus:
[TABLE]
where the last equality is because ν\mspace−1.5mu⊳(b,⋆)=0 for
all b∈B.
Similarly for P3, using μ\mspace−1.5mu⊲(⋆,a)=ν\mspace−1.5mu⊲(⋆,b)=0 for all a∈A and b∈B, we have:
[TABLE]
Putting everything together, we conclude
[TABLE]
by assumption, so μ\mspace−1.5mu⊲,μ\mspace−1.5mu⊳ witness the desired approximate
lifting.
8. Conclusion
We have proposed a new definition of approximate lifting that unifies all known
existing constructions and satisfies an approximate variant of Strassen’s
theorem. Our notion is useful both to simplify the soundness proof of existing
program logics and to strengthen some of their proof rules.
Subsequent to the original publication of this work, researchers have explored
two extensions of ⋆-liftings. First, Albarghouthi and Hsu [2018] develop variable
approximate liftings, a refinement of the (ε,0)-approximate liftings
where the ε parameter is real-valued function A×B→R≥0
and the distance condition on witnesses is generalized to
[TABLE]
In effect, the approximation parameters may vary over pairs of samples instead
of being a uniform upper bound. This refinement allows capturing more precise
approximation bounds, in some cases simplifying proofs of differential privacy.
An (ε,δ) version of variable approximate lifting is currently not
known.
Second, Sato et al. [2018] explore 2-liftings in the continuous case modeling
differential privacy and f-divergences, but also relaxations of differential
privacy based on Rényi divergences [Bun and Steinke, 2016, Mironov, 2017]. These
2-liftings subsume ⋆-liftings; a continuous analogue of the approximate
Strassen theorem is strongly believed to hold but remains to be shown.
We see at least two important directions for future work. First, adapting
existing program logics (in particular, apRHL [Barthe et al., 2013]) to
use ⋆-liftings, and formalizing examples that were out of reach of
previous systems. Second, symmetric ⋆-liftings seem to be an important
notion—for instance, the advanced composition theorem of differential
privacy [Dwork et al., 2010] applies to these liftings—but only existential versions
of the definition are currently known. A universal definition, similar to Sato’s
definition for asymmetric liftings, would give more evidence that symmetric
liftings are indeed a mathematically interesting abstraction, and also give a
more convenient route to constructing such liftings.
Acknowledgments
We thank the anonymous reviewers for their helpful suggestions. This work was
partially supported by a grant from the NSF (TWC-1513694), a grant from the
Simons Foundation (#360368 to Justin Hsu), and the ERC Starting Grant
ProFoundNet (#679127).
Appendix A Detailed Proofs
In the proofs, we will sometimes refer to the witnesses of a ⋆-lifting.
{nota}
Let μ1∈D(A) and μ2∈D(B) be sub-distributions,
ε,δ∈R+ and R be a binary relation over A and B.
If two distributions η\mspace−1.5mu⊲∈D(A×B⋆) and η\mspace−1.5mu⊳∈D(A⋆×B) are witnesses to the ⋆-lifting μ1Rε,δ(⋆)μ2, then we write:
[TABLE]
\prooftoks\prooftoks
Bibliography21
The reference list from the paper itself. Each links out to its DOI / PubMed record.
1Aharoni et al. [2011] R. Aharoni, E. Berger, A. Georgakopoulos, A. Perlstein, and P. Sprüssel. The max-flow min-cut theorem for countable networks . J. Comb. Theory, Ser. B , 101(1):1–17, 2011. · doi ↗
2Albarghouthi and Hsu [2018] A. Albarghouthi and J. Hsu. Synthesizing coupling proofs of differential privacy . Proceedings of the ACM on Programming Languages , 2(POPL), Jan. 2018. Appeared at ACM SIGPLAN–SIGACT Symposium on Principles of Programming Languages (POPL), Los Angeles, California.
3Barthe and Olmedo [2013] G. Barthe and F. Olmedo. Beyond differential privacy: Composition theorems and relational logic for f 𝑓 f -divergences between probabilistic programs . In International Colloquium on Automata, Languages and Programming (ICALP), Riga, Latvia , volume 7966 of Lecture Notes in Computer Science , pages 49–60. Springer-Verlag, 2013.
4Barthe et al. [2013] G. Barthe, B. Köpf, F. Olmedo, and S. Zanella-Béguelin. Probabilistic relational reasoning for differential privacy . ACM Transactions on Programming Languages and Systems , 35(3):9, 2013.
5Barthe et al. [2016 a] G. Barthe, N. Fong, M. Gaboardi, B. Grégoire, J. Hsu, and P.-Y. Strub. Advanced probabilistic couplings for differential privacy . In ACM SIGSAC Conference on Computer and Communications Security (CCS), Vienna, Austria , 2016 a.
6Barthe et al. [2016 b] G. Barthe, M. Gaboardi, B. Grégoire, J. Hsu, and P.-Y. Strub. Proving differential privacy via probabilistic couplings . In IEEE Symposium on Logic in Computer Science (LICS), New York, New York , 2016 b.
7Barthe et al. [2016 c] G. Barthe, M. Gaboardi, J. Hsu, and B. C. Pierce. Programming language techniques for differential privacy . SIGLOG News , 3(1):34–53, 2016 c.
8Bun and Steinke [2016] M. Bun and T. Steinke. Concentrated differential privacy: Simplifications, extensions, and lower bounds . In IACR Theory of Cryptography Conference (TCC), Beijing, China , volume 9985 of Lecture Notes in Computer Science , pages 635–658. Springer-Verlag, 2016. · doi ↗