Heterogeneous Gaussian Mechanism: Preserving Differential Privacy in Deep Learning with Provable Robustness

NhatHai Phan; Minh Vu; Yang Liu; Ruoming Jin; Dejing Dou; Xintao Wu; and My T. Thai

arXiv:1906.01444·cs.CR·June 5, 2019


TL;DR

This paper introduces the Heterogeneous Gaussian Mechanism (HGM), a new method for preserving differential privacy in deep neural networks that also enhances robustness against adversarial attacks through innovative noise redistribution and theoretical guarantees.

Contribution

The paper proposes HGM, relaxing privacy constraints and enabling noise redistribution, which improves the robustness and utility of differentially private deep learning models.

Findings

1. HGM provides stronger robustness bounds against adversarial attacks.

2. HGM outperforms baseline methods in robustness evaluations.

3. Theoretical analysis confirms improved privacy-utility trade-off.

Abstract

In this paper, we propose a novel Heterogeneous Gaussian Mechanism (HGM) to preserve differential privacy in deep neural networks, with provable robustness against adversarial examples. We first relax the constraint of the privacy budget in the traditional Gaussian Mechanism from $(0,1]$ to $(0,\infty)$, with a new bound of the noise scale to preserve differential privacy. The noise in our mechanism can be arbitrarily redistributed, offering a distinctive ability to address the trade-off between model utility and privacy loss. To derive provable robustness, our HGM is applied to inject Gaussian noise into the first hidden layer. Then, a tighter robustness bound is proposed. Theoretical analysis and thorough evaluations show that our mechanism notably improves the robustness of differentially private deep neural networks, compared with baseline approaches, under a variety of model attacks.


Equations

\Pr[A(D)=\mathbf{o}]\leq e^{\epsilon}\Pr[A(D^{\prime})=\mathbf{o}]+\delta

\theta^{*}=\arg\min_{\theta}\mathbb{E}_{(x,y_{\text{true}})\sim\mathcal{D}}\Big[\max_{\lVert\alpha\rVert_{p}\leq\mu}L\big(f(x+\alpha,\theta),y_{\text{true}}\big)\Big]

\forall\alpha\in l_{p}(\mu):\ f_{k}(x+\alpha)>\max_{i:i\neq k}f_{i}(x+\alpha)

\forall k,\forall\alpha\in l_{p}(\mu=1):\ \mathbb{E}f_{k}(x)\leq e^{\epsilon_{r}}\mathbb{E}f_{k}(x+\alpha)+\delta_{r}

\forall\alpha\in l_{p}(\mu=1):\ \hat{\mathbb{E}}_{lb}f_{k}(x)>e^{2\epsilon_{r}}\max_{i:i\neq k}\hat{\mathbb{E}}_{ub}f_{i}(x)+(1+e^{\epsilon_{r}})\delta_{r}

\mu_{max}=\max_{\mu\in\mathbb{R}^{+}}\mu\ \text{ such that }\ \forall\alpha\in l_{p}(\mu):\ \hat{\mathbb{E}}_{lb}f_{k}(x)>e^{2\epsilon_{r}}\max_{i:i\neq k}\hat{\mathbb{E}}_{ub}f_{i}(x)+(1+e^{\epsilon_{r}})\delta_{r},\quad\sigma_{r}=\sqrt{2\ln(1.25/\delta_{r})}\,\Delta_{p,2}\,\mu/\epsilon_{r}\ \text{ and }\ \epsilon_{r}\leq 1

\epsilon>0,\quad\sigma\geq\frac{\sqrt{2}\Delta_{A}}{2\epsilon}\big(\sqrt{s}+\sqrt{s+\epsilon}\big),\quad\text{and}\quad s=\ln\Big(\sqrt{\frac{2}{\pi}}\frac{1}{\delta}\Big)

\Delta_{f}=\max_{x,x^{\prime}:x\neq x^{\prime}}\frac{\big\lVert\frac{h_{1}(x)-h_{1}(x^{\prime})}{K\mathbf{r}}\big\rVert_{2}}{\lVert x-x^{\prime}\rVert_{\infty}}\leq\Big\lVert\frac{W_{1}}{K\mathbf{r}}\Big\rVert_{\infty,2}

\mathbf{r}=\frac{\mathbf{s}}{\sum_{s_{i}\in\mathbf{s}}s_{i}},\ \text{where}\ \mathbf{s}=\frac{1}{n}\sum_{x\in D}\Big|\frac{\partial L(\theta,x)}{\partial h_{1}(x)}\Big|^{\beta}

h_{1}(x_{i})=W_{1}^{T}x_{i}+\gamma

\widetilde{g}_{t}\leftarrow\frac{1}{m}\Big(\sum_{i}\frac{\mathbf{g}_{t}(x_{i})}{\max\big(1,\frac{\lVert\mathbf{g}_{t}(x_{i})\rVert_{2}}{C}\big)}+\mathcal{N}(0,\sigma^{2}C^{2}\mathbf{I})\Big)

\mu_{max}=\max_{\mu\in\mathbb{R}^{+}}\mu,\ \text{ such that }\ \forall\alpha\in l_{p}(\mu):\ \hat{\mathbb{E}}_{lb}f_{k}(x)>e^{2\epsilon_{r}}\max_{i:i\neq k}\hat{\mathbb{E}}_{ub}f_{i}(x)+(1+e^{\epsilon_{r}})\delta_{r},\quad\sigma_{r}=\frac{\sqrt{2}}{2\epsilon_{r}}\big(\sqrt{s}+\sqrt{s+\epsilon_{r}}\big)\Delta_{f}\times\mu/\epsilon_{r}\ \text{ and }\ \epsilon_{r}>0

\text{conventional accuracy}=\frac{\sum_{i=1}^{|test|}isCorrect(x_{i})}{|test|}

\text{certified accuracy}=\frac{\sum_{i=1}^{|test|}isCorrect(x_{i})\ \&\ isRobust(x_{i})}{|test|}

\mathcal{L}(\mathbf{o};\mathcal{M},D,D^{\prime})=\ln\frac{\mathrm{Pr}[\mathcal{M}(D,A,\sigma)=\mathbf{o}]}{\mathrm{Pr}[\mathcal{M}(D^{\prime},A,\sigma)=\mathbf{o}]}

\begin{split}&|\mathcal{L}(\mathbf{o};\mathcal{M},D,D^{\prime})|=\left|\ln\frac{\mathrm{Pr}[A(D)+\mathcal{N}(0,\sigma^{2}\Delta^{2}_{A})=\mathbf{o}]}{\mathrm{Pr}[A(D^{\prime})+\mathcal{N}(0,\sigma^{2}\Delta^{2}_{A})=\mathbf{o}]}\right|\\ &=\left|\ln\frac{\prod_{i=1}^{K}\exp\Big(-\frac{1}{2\sigma^{2}\Delta^{2}_{A}}\big(o_{i}-A(D)_{i}\big)^{2}\Big)}{\prod_{i=1}^{K}\exp\Big(-\frac{1}{2\sigma^{2}\Delta^{2}_{A}}\big(o_{i}-A(D)_{i}+v_{i}\big)^{2}\Big)}\right|\\ &=\frac{1}{2\sigma^{2}\Delta^{2}_{A}}\left|\sum_{i=1}^{K}\big(o_{i}-A(D)_{i}\big)^{2}-\big(o_{i}-A(D)_{i}+v_{i}\big)^{2}\right|\\ &=\frac{1}{2\sigma^{2}\Delta^{2}_{A}}\left|\|\mathbf{z}\|^{2}-\|\mathbf{z}+\mathbf{v}\|^{2}\right|\end{split}

\|\mathbf{z}+\mathbf{v}\|^{2}=\|\mathbf{v}+z_{1}^{\prime}\|^{2}+\sum_{i=2}^{K}\|z_{i}^{\prime}\|^{2},\quad\|\mathbf{z}\|^{2}=\sum_{i=1}^{K}\|z_{i}^{\prime}\|^{2}

\begin{split}|\mathcal{L}(\mathbf{o};\mathcal{M},D,D^{\prime})|&=\frac{1}{2\sigma^{2}\Delta_{A}^{2}}\left|\|\mathbf{z}\|^{2}-\|\mathbf{z}+\mathbf{v}\|^{2}\right|\\ &=\frac{1}{2\sigma^{2}\Delta_{A}^{2}}\left|\sum_{i=1}^{K}\|z_{i}^{\prime}\|^{2}-\Big(\|z_{1}^{\prime}+\mathbf{v}\|^{2}+\sum_{i=2}^{K}\|z_{i}^{\prime}\|^{2}\Big)\right|\\ &=\frac{1}{2\sigma^{2}\Delta_{A}^{2}}\left|\|z_{1}^{\prime}+\mathbf{v}\|^{2}-\|z_{1}^{\prime}\|^{2}\right|\\ &=\frac{1}{2\sigma^{2}\Delta_{A}^{2}}\left|(\|\mathbf{v}\|+\lambda_{1})^{2}-\lambda_{1}^{2}\right|\\ &=\frac{1}{2\sigma^{2}\Delta_{A}^{2}}\left|\|\mathbf{v}\|^{2}+2\lambda_{1}\|\mathbf{v}\|\right|\\ &\leq\frac{1}{2\sigma^{2}\Delta_{A}^{2}}\big(\Delta_{A}^{2}+2|\lambda_{1}|\Delta_{A}\big)=\frac{1}{2\sigma^{2}}\Big(1+\frac{2|\lambda_{1}|}{\Delta_{A}}\Big)\end{split}

\begin{split}&|\mathcal{L}(\mathbf{o};\mathcal{M},D,D^{\prime})|\leq\frac{1}{2\sigma^{2}}\Big(1+\frac{2|\lambda_{1}|}{\Delta_{A}}\Big)\leq\epsilon\\ &\Leftrightarrow-2\sigma^{2}\epsilon\leq 1+\frac{2|\lambda_{1}|}{\Delta_{A}}\leq 2\sigma^{2}\epsilon\\ &\Leftrightarrow|\lambda_{1}|\leq\frac{\Delta_{A}}{2}(2\sigma^{2}\epsilon-1)\end{split}

\Pr(|\lambda_{1}|\leq\lambda_{max})\geq 1-\delta

\Pr(|\lambda_{1}|\leq\lambda_{max})=1-2\Pr(\lambda_{1}>\lambda_{max})

\Pr(|\lambda_{1}|\leq\lambda_{max})\geq 1-\delta\Leftrightarrow 1-2\Pr(\lambda_{1}>\lambda_{max})\geq 1-\delta\Leftrightarrow\Pr(\lambda_{1}>\lambda_{max})\leq\frac{\delta}{2}

\frac{\sigma\Delta_{A}}{\sqrt{2\pi}\,t}e^{-\frac{t^{2}}{2\sigma^{2}\Delta_{A}^{2}}}\leq\frac{\delta}{2}\Leftrightarrow\frac{\sigma\Delta_{A}}{t}e^{-\frac{t^{2}}{2\sigma^{2}\Delta_{A}^{2}}}\leq\frac{\sqrt{2\pi}\,\delta}{2}\Leftrightarrow\frac{t}{\sigma\Delta_{A}}e^{\frac{t^{2}}{2\sigma^{2}\Delta_{A}^{2}}}\geq\sqrt{\frac{2}{\pi}}\frac{1}{\delta}

\begin{split}&\frac{t}{\sigma\Delta_{A}}e^{\frac{t^{2}}{2\sigma^{2}\Delta^{2}_{A}}}\geq\sqrt{\frac{2}{\pi}}\frac{1}{\delta}\Leftrightarrow\frac{2\sigma^{2}\epsilon-1}{2\sigma}e^{\frac{1}{2}\big(\frac{2\sigma^{2}\epsilon-1}{2\sigma}\big)^{2}}\geq\sqrt{\frac{2}{\pi}}\frac{1}{\delta}\\ &\Leftrightarrow\ln\frac{2\sigma^{2}\epsilon-1}{2\sigma}+\frac{1}{2}\Big(\frac{2\sigma^{2}\epsilon-1}{2\sigma}\Big)^{2}\geq\ln\Big(\sqrt{\frac{2}{\pi}}\frac{1}{\delta}\Big)\end{split}

\ln\frac{2\sigma^{2}\epsilon-1}{2\sigma}\geq 0\Leftrightarrow\frac{2\sigma^{2}\epsilon-1}{2\sigma}\geq 1\Leftrightarrow 2\sigma^{2}\epsilon-2\sigma-1\geq 0

\sigma\geq\frac{1+\sqrt{1+2\epsilon}}{2\epsilon}\quad\text{(Condition 1)}

\begin{split}&\frac{1}{2}\Big(\frac{2\sigma^{2}\epsilon-1}{2\sigma}\Big)^{2}\geq s\Leftrightarrow\frac{2\sigma^{2}\epsilon-1}{2\sigma}\geq\sqrt{2s}\\ &\Leftrightarrow 2\sigma^{2}\epsilon-2\sigma\sqrt{2s}-1\geq 0\end{split}

\sigma\geq\frac{\sqrt{2}}{2\epsilon}\big(\sqrt{s}+\sqrt{s+\epsilon}\big)\quad\text{(Condition 2)}


Full text

Heterogeneous Gaussian Mechanism:

Preserving Differential Privacy in Deep Learning with Provable Robustness

NhatHai Phan1∗ (∗ co-first authors)

Minh Vu5∗

Yang Liu1∗

Ruoming Jin2

Dejing Dou3

Xintao Wu4

My T. Thai5

1New Jersey Institute of Technology, USA; 2Kent State University, USA; 3University of Oregon, USA; 4University of Arkansas, USA; 5University of Florida, USA

{phan,yl558}@njit.edu, {minhvu,mythai}@ufl.edu, [email protected], [email protected], [email protected]

Abstract

In this paper, we propose a novel Heterogeneous Gaussian Mechanism (HGM) to preserve differential privacy in deep neural networks, with provable robustness against adversarial examples. We first relax the constraint of the privacy budget in the traditional Gaussian Mechanism from $(0,1]$ to $(0,\infty)$ , with a new bound of the noise scale to preserve differential privacy. The noise in our mechanism can be arbitrarily redistributed, offering a distinctive ability to address the trade-off between model utility and privacy loss. To derive provable robustness, our HGM is applied to inject Gaussian noise into the first hidden layer. Then, a tighter robustness bound is proposed. Theoretical analysis and thorough evaluations show that our mechanism notably improves the robustness of differentially private deep neural networks, compared with baseline approaches, under a variety of model attacks.

1 Introduction

Recent developments in machine learning (ML) have made it easier than ever to share and deploy ML models in practical applications. This raises critical privacy and security issues when ML models are built on personal data, e.g., clinical records, images, and user profiles. In fact, adversaries can conduct: 1) privacy model attacks, in which deployed ML models can be used to reveal sensitive information in the private training data Fredrikson et al. (2015); Wang et al. (2015); Shokri et al. (2017); Papernot et al. (2016); and 2) adversarial example attacks Goodfellow et al. (2014) to cause the models to misclassify. Note that adversarial examples are maliciously perturbed inputs designed to mislead a model at test time Liu et al. (2016); Carlini and Wagner (2017). This poses serious risks to deploying machine learning models in practice. Therefore, it is of paramount significance to simultaneously preserve privacy in the private training data and guarantee the robustness of the model under adversarial examples.

To preserve privacy in the training set, recent efforts have focused on applying Gaussian Mechanism (GM) Dwork and Roth (2014) to preserve differential privacy (DP) in deep learning Abadi et al. (2016); Hamm et al. (2017); Yu et al. (2019); Lee and Kifer (2018). The concept of DP is an elegant formulation of privacy in probabilistic terms, and provides a rigorous protection for an algorithm to avoid leaking personal information contained in its inputs. It is becoming mainstream in many research communities and has been deployed in practice in the private sector and government agencies. DP ensures that the adversary cannot infer any information with high confidence (controlled by a privacy budget $\epsilon$ and a broken probability $\delta$ ) about any specific tuple from the released results. GM is also applied to derive provable robustness against adversarial examples Lecuyer et al. (2018). However, existing efforts only focus on either preserving DP or deriving provable robustness Kolter and Wong (2017); Raghunathan et al. (2018), but not both DP and robustness!

With the current form of GM Dwork and Roth (2014) applied in existing works Abadi et al. (2016); Hamm et al. (2017); Lecuyer et al. (2018), it is challenging to preserve DP in order to protect the training data, with provable robustness. In GM, random noise scaled to $\mathcal{N}(0,\sigma^{2})$ is injected into each of the components of an algorithm output, where the noise scale $\sigma$ is a function of $\epsilon$ , $\delta$ , and the mechanism sensitivity $\Delta$ . In fact, there are three major limitations in these works when applying GM: (1) The privacy budget $\epsilon$ in GM is restricted to $(0,1]$ , resulting in a limited search space to optimize the model utility and robustness bounds; (2) All the features (components) are treated the same in terms of the amount of noise injected. That may not be optimal in real-world scenarios Bach et al. (2015); Phan et al. (2017); and (3) Existing works have not been designed to defend against adversarial examples, while preserving differential privacy in order to protect the training data. These limitations do narrow the applicability of GM, DP, deep learning, and provable robustness, by affecting the model utility, flexibility, reliability, and resilience to model attacks in practice.

Our Contributions. To address these issues, we first propose a novel Heterogeneous Gaussian Mechanism (HGM), in which (1) the constraint of $\epsilon$ is extended from $(0,1]$ to $(0,\infty)$ ; (2) a new lower bound of the noise scale $\sigma$ will be presented; and more importantly, (3) the magnitude of noise can be heterogeneously injected into each of the features or components. These significant extensions offer a distinctive ability to address the trade-off among model utility, privacy loss, and robustness by redistributing the noise and enlarging the search space for better defensive solutions.

Second, we develop a novel approach, called Secure-SGD, to achieve both DP and robustness in the general scenario, i.e., any value of the privacy budget $\epsilon$ . In Secure-SGD, our HGM is applied to inject Gaussian noise into the first hidden layer of a deep neural network. This noise is used to derive a tighter and provable robustness bound. Then, DP stochastic gradient descent (DPSGD) algorithm Abadi et al. (2016) is applied to learn differentially private model parameters. The training process of our mechanism preserves DP in deep neural networks to protect the training data with provable robustness. To our knowledge, Secure-SGD is the first approach to learn such a secure model with a high utility. Rigorous experiments conducted on MNIST and CIFAR-10 datasets Lecun et al. (1998); Krizhevsky and Hinton (2009) show that our approach significantly improves the robustness of DP deep neural networks, compared with baseline approaches.

2 Preliminaries and Related Work

In this section, we revisit differential privacy and PixelDP Lecuyer et al. (2018), and introduce our problem definition. Let $D$ be a database that contains $n$ tuples, each of which contains data $x\in[-1,1]^{d}$ and a ground-truth label $y\in\mathbb{Z}_{K}$. Let us consider a classification task with $K$ possible categorical outcomes; i.e., the data label $y$ given $x\in D$ is assigned to only one of the $K$ categories. Each $y$ can be considered as a one-hot vector of $K$ categories $y=\{y_{1},\ldots,y_{K}\}$. On input $x$ and parameters $\theta$, a model outputs class scores $f:\mathbb{R}^{d}\rightarrow\mathbb{R}^{K}$ that map $d$-dimensional inputs $x$ to a vector of scores $f(x)=\{f_{1}(x),\ldots,f_{K}(x)\}$ s.t. $\forall k:f_{k}(x)\in[0,1]$ and $\sum_{k=1}^{K}f_{k}(x)=1$. The class with the highest score value is selected as the predicted label for the data tuple, denoted as $y(x)=\arg\max_{k\in K}f_{k}(x)$. We specify a loss function $L(f(x),y)$ that represents the penalty for a mismatch between the predicted values $f(x)$ and the original values $y$.

Differential Privacy. The definitions of differential privacy and Gaussian Mechanism are as follows:

Definition 1

$(\epsilon,\delta)$-Differential Privacy Dwork et al. (2006). A randomized algorithm $A$ fulfills $(\epsilon,\delta)$-differential privacy, if for any two databases $D$ and $D^{\prime}$ differing in at most one tuple, and for all $\mathbf{o}\subseteq Range(A)$, we have:

$$\Pr[A(D)=\mathbf{o}]\leq e^{\epsilon}\Pr[A(D^{\prime})=\mathbf{o}]+\delta$$

Smaller $\epsilon$ and $\delta$ enforce a stronger privacy guarantee.

Here, $\epsilon$ controls the amount by which the distributions induced by $D$ and $D^{\prime}$ may differ, and $\delta$ is a broken probability. DP also applies to general metrics $\rho(D,D^{\prime})\leq 1$ , including Hamming metric as in Definition 1 and $l_{p\in\{1,2,\infty\}}$ -norms Chatzikokolakis et al. (2013). Gaussian Mechanism is applied to achieve DP given a random algorithm $A$ as follows:

Theorem 1

Gaussian Mechanism Dwork and Roth (2014). Let $A:\mathbb{R}^{d}\rightarrow\mathbb{R}^{K}$ be an arbitrary $K$ -dimensional function, and define its $l_{2}$ sensitivity to be $\Delta_{A}=\max_{D,D^{\prime}}\lVert A(D)-A(D^{\prime})\rVert_{2}$ . The Gaussian Mechanism with parameter $\sigma$ adds noise scaled to $\mathcal{N}(0,\sigma^{2})$ to each of the $K$ components of the output. Given $\epsilon\in(0,1]$ , the Gaussian Mechanism with $\sigma\geq\sqrt{2\ln(1.25/\delta)}\Delta_{A}/\epsilon$ is $(\epsilon,\delta)$ -DP.
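As a minimal sketch of Theorem 1, the noise scale can be computed and applied as follows. The function names and the list-based interface are illustrative assumptions and are not part of the original paper; the sensitivity $\Delta_{A}$ is taken as a given input.

```python
import math
import random


def classical_gaussian_sigma(epsilon, delta, sensitivity):
    """Noise scale of Theorem 1: sqrt(2 ln(1.25/delta)) * Delta_A / epsilon."""
    if not 0.0 < epsilon <= 1.0:
        raise ValueError("Theorem 1 requires epsilon in (0, 1]")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie in (0, 1)")
    return math.sqrt(2.0 * math.log(1.25 / delta)) * sensitivity / epsilon


def gaussian_mechanism(output, epsilon, delta, sensitivity, rng=random):
    """Add independent N(0, sigma^2) noise to each of the K output components."""
    sigma = classical_gaussian_sigma(epsilon, delta, sensitivity)
    return [o + rng.gauss(0.0, sigma) for o in output]
```

Note that the sketch rejects any $\epsilon>1$, which is exactly the restriction that the rest of the paper sets out to relax.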

Adversarial Examples. For some target model $f$ and inputs $(x,y_{\text{true}})$, i.e., $y_{\text{true}}$ is the true label of $x$, one of the adversary’s goals is to find an adversarial example $x^{\text{adv}}=x+\alpha$, where $\alpha$ is the perturbation introduced by the attacker, such that: (1) $x^{\text{adv}}$ and $x$ are close, and (2) the model misclassifies $x^{\text{adv}}$, i.e., $y(x^{\text{adv}})\neq y(x)$. In this paper, we consider well-known classes of $l_{p\in\{1,2,\infty\}}$-norm bounded attacks Goodfellow et al. (2014). Let $l_{p}(\mu)=\{\alpha\in\mathbb{R}^{d}:\lVert\alpha\rVert_{p}\leq\mu\}$ be the $l_{p}$-norm ball of radius $\mu$. One of the goals in adversarial learning is to minimize the risk over adversarial examples:

$$\theta^{*}=\arg\min_{\theta}\mathbb{E}_{(x,y_{\text{true}})\sim\mathcal{D}}\Big[\max_{\lVert\alpha\rVert_{p}\leq\mu}L\big(f(x+\alpha,\theta),y_{\text{true}}\big)\Big]$$

where a specific attack is used to approximate solutions to the inner maximization problem, and the outer minimization problem corresponds to training the model $f$ with parameters $\theta$ over these adversarial examples $x^{\text{adv}}=x+\alpha$ .

We revisit two basic attacks in this paper. The first one is a single-step algorithm, in which only a single gradient computation is required. For instance, Fast Gradient Sign Method (FGSM) algorithm Goodfellow et al. (2014) finds an adversarial example by maximizing the loss function $L(f(x^{\text{adv}},\theta),y_{\text{true}})$ . The second one is an iterative algorithm, in which multiple gradients are computed and updated. For instance, in Kurakin et al. (2016), FGSM is applied multiple times with small steps, each of which has a size of $\mu/T_{\mu}$ , where $T_{\mu}$ is the number of steps.
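The two attacks can be sketched for the $l_{\infty}$ case as follows. This is an illustrative sketch only: `grad_fn` is a hypothetical callable that returns the gradient of the loss with respect to the input, and clipping to $[-1,1]$ is an assumption taken from the data domain $x\in[-1,1]^{d}$ used in this paper.

```python
def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def clip(v, lo=-1.0, hi=1.0):
    """Keep a feature inside the assumed data domain [-1, 1]."""
    return max(lo, min(hi, v))


def fgsm(x, grad_fn, mu):
    """Single-step attack: one gradient, each feature moved by mu along its sign."""
    g = grad_fn(x)
    return [clip(xi + mu * sign(gi)) for xi, gi in zip(x, g)]


def iterative_fgsm(x, grad_fn, mu, steps):
    """Iterative attack: `steps` FGSM updates, each with step size mu / steps."""
    x_adv = list(x)
    for _ in range(steps):
        g = grad_fn(x_adv)  # the gradient is recomputed at the current point
        x_adv = [clip(xi + (mu / steps) * sign(gi)) for xi, gi in zip(x_adv, g)]
    return x_adv
```

Because every step moves each feature by at most $\mu/T_{\mu}$, the accumulated perturbation of the iterative variant stays inside the $l_{\infty}(\mu)$ ball without an explicit projection.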

Provable Robustness and PixelDP. In this paper, we consider the following robustness definition. Given a benign example $x$ , we focus on achieving a robustness condition to attacks of $l_{p}(\mu)$ -norm, as follows:

$$\forall\alpha\in l_{p}(\mu):\ f_{k}(x+\alpha)>\max_{i:i\neq k}f_{i}(x+\alpha)\tag{2}$$

where $k$ = $y(x)$ , indicating that a small perturbation $\alpha$ in the input does not change the predicted label $y(x)$ .

To achieve the robustness condition in Eq. 2, Lecuyer et al. (2018) introduce an algorithm, called PixelDP. By considering an input $x$ (e.g., an image) as a database in DP parlance, and individual features (e.g., pixels) as tuples in DP, PixelDP shows that randomizing the scoring function $f(x)$ to enforce DP on a small number of pixels in an image guarantees robustness of predictions against adversarial examples that can change up to that number of pixels. To achieve the goal, noise $\mathcal{N}(0,\sigma^{2}_{r})$ is injected into either the input $x$ or some hidden layer of a deep neural network. That results in the following $(\epsilon_{r},\delta_{r})$-PixelDP condition, with a budget $\epsilon_{r}$ and a broken probability $\delta_{r}$ of robustness, as follows:

Lemma 1

$(\epsilon_{r},\delta_{r})$ -PixelDP Lecuyer et al. (2018). Given a randomized scoring function $f(x)$ satisfying $(\epsilon_{r},\delta_{r})$ -PixelDP w.r.t. a $l_{p}$ -norm metric, we have:

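$$\forall k,\forall\alpha\in l_{p}(\mu=1):\ \mathbb{E}f_{k}(x)\leq e^{\epsilon_{r}}\mathbb{E}f_{k}(x+\alpha)+\delta_{r}\tag{3}$$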

where $\mathbb{E}f_{k}(x)$ is the expected value of $f_{k}(x)$ .

The network is trained by applying typical optimizers, such as SGD. At the prediction time, a certified robustness check is implemented for each prediction. A generalized robustness condition is proposed as follows:

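$$\forall\alpha\in l_{p}(\mu=1):\ \hat{\mathbb{E}}_{lb}f_{k}(x)>e^{2\epsilon_{r}}\max_{i:i\neq k}\hat{\mathbb{E}}_{ub}f_{i}(x)+(1+e^{\epsilon_{r}})\delta_{r}\tag{4}$$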

where $\hat{\mathbb{E}}_{lb}$ and $\hat{\mathbb{E}}_{ub}$ are the lower bound and upper bound of the expected value $\hat{\mathbb{E}}f(x)=\frac{1}{N}\sum_{n=1}^{N}f(x)_{n}$, derived from the Monte Carlo estimation with an $\eta$-confidence, where $N$ is the number of invocations of $f(x)$ with independent draws of the noise with scale $\sigma_{r}$. Passing the check for a given input $x$ guarantees that no perturbation exists up to $l_{p}(\mu=1)$-norm that causes the model to change its prediction result. In other words, the classification model, based on $\hat{\mathbb{E}}f(x)$, i.e., $\arg\max_{k}\hat{\mathbb{E}}f_{k}(x)$, is consistent under attacks of $l_{p}(\mu=1)$-norm on $x$ with probability $\geq\eta$. Group privacy Dwork et al. (2006) can be applied to achieve the same robustness condition, given a particular size of perturbation $l_{p}(\mu)$. For a given $\sigma_{r}$, $\delta_{r}$, and sensitivity $\Delta_{p,2}$ used at prediction time, PixelDP solves for the maximum $\mu$ for which the robustness condition in Eq. 4 checks out:

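$$\mu_{max}=\max_{\mu\in\mathbb{R}^{+}}\mu\ \text{ such that }\ \forall\alpha\in l_{p}(\mu):\ \hat{\mathbb{E}}_{lb}f_{k}(x)>e^{2\epsilon_{r}}\max_{i:i\neq k}\hat{\mathbb{E}}_{ub}f_{i}(x)+(1+e^{\epsilon_{r}})\delta_{r},\quad\sigma_{r}=\sqrt{2\ln(1.25/\delta_{r})}\,\Delta_{p,2}\,\mu/\epsilon_{r}\ \text{ and }\ \epsilon_{r}\leq 1\tag{5}$$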
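The certified check and the search for the largest certified attack size can be sketched as follows. This is a hedged illustration rather than PixelDP's implementation: the bounds `lb` and `ub` are assumed to be precomputed from the Monte Carlo estimate, the candidate sizes in `mu_grid` are a hypothetical discretization, and the relation between $\mu$ and $\epsilon_{r}$ is the classical Gaussian Mechanism scale with $\epsilon_{r}\leq 1$.

```python
import math


def passes_robustness_check(k, lb, ub, eps_r, delta_r):
    """Check in the style of Eq. 4 for predicted class k.

    lb[i] and ub[i] are lower and upper bounds on the expected score of class i.
    """
    runner_up = max(ub[i] for i in range(len(ub)) if i != k)
    threshold = math.exp(2.0 * eps_r) * runner_up + (1.0 + math.exp(eps_r)) * delta_r
    return lb[k] > threshold


def pixeldp_mu_max(k, lb, ub, sigma_r, delta_r, sensitivity, mu_grid):
    """Largest mu in mu_grid that is certified under the classical noise scale."""
    c = math.sqrt(2.0 * math.log(1.25 / delta_r))
    best = 0.0
    for mu in sorted(mu_grid):
        # Solve sigma_r = c * sensitivity * mu / eps_r for eps_r.
        eps_r = c * sensitivity * mu / sigma_r
        # eps_r grows with mu and the threshold grows with eps_r, so stop at the first failure.
        if eps_r > 1.0 or not passes_robustness_check(k, lb, ub, eps_r, delta_r):
            break
        best = mu
    return best
```

In this sketch the cap $\epsilon_{r}\leq 1$ can terminate the search even when the score gap would still certify a larger $\mu$, which is the limitation discussed in the next section.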

3 Heterogeneous Gaussian Mechanism

We now formally present our Heterogeneous Gaussian Mechanism (HGM) and the Secure-SGD algorithm. In Eq. 5, it is clear that $\epsilon$ is restricted to be $(0,1]$ , following the Gaussian Mechanism (Theorem 1). That affects the robustness bound in terms of flexibility, reliability, and utility. In fact, adversaries only need to guarantee that $\hat{\mathbb{E}}_{lb}f_{k}(x+\alpha)$ is larger than at most $e^{2}\max_{i:i\neq k}\hat{\mathbb{E}}_{ub}f_{i}(x+\alpha)+(1+e)\delta$ , i.e., $\epsilon_{r}=1$ , in order to assault the robustness condition: thus, softening the robustness bound. In addition, the search space for the robustness bound $\mu_{max}$ is limited, given $\epsilon\in(0,1]$ . These issues increase the number of robustness violations, potentially degrading the utility and reliability of the robustness bound. In real-world applications, such as healthcare, autonomous driving, object recognition, etc., a flexible value of $\epsilon_{r}$ is needed to implement stronger and more practical robustness bounds. This is also true for many other algorithms applying Gaussian Mechanism Dwork and Roth (2014).

To relax this constraint, we introduce an Extended Gaussian Mechanism as follows:

Theorem 2

Extended Gaussian Mechanism. Let $A:\mathbb{R}^{d}\rightarrow\mathbb{R}^{K}$ be an arbitrary $K$ -dimensional function, and define its $l_{2}$ sensitivity to be $\Delta_{A}=\max_{D,D^{\prime}}\lVert A(D)-A(D^{\prime})\rVert_{2}$ . An Extended Gaussian Mechanism $M$ with parameter $\sigma$ adds noise scaled to $\mathcal{N}(0,\sigma^{2})$ to each of the $K$ components of the output. The mechanism $M$ is $(\epsilon,\delta)$ -DP, with

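$$\epsilon>0,\quad\sigma\geq\frac{\sqrt{2}\Delta_{A}}{2\epsilon}\big(\sqrt{s}+\sqrt{s+\epsilon}\big),\quad\text{and}\quad s=\ln\Big(\sqrt{\frac{2}{\pi}}\frac{1}{\delta}\Big)$$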
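The bound of Theorem 2, $\sigma\geq\frac{\sqrt{2}\Delta_{A}}{2\epsilon}(\sqrt{s}+\sqrt{s+\epsilon})$ with $s=\ln(\sqrt{2/\pi}/\delta)$, can be evaluated numerically as in the following sketch. The function name is an illustrative assumption, and the guard on $\delta$ is added here only so that $s$ is non-negative and the square roots are defined.

```python
import math


def extended_gaussian_sigma(epsilon, delta, sensitivity):
    """Lower bound on sigma from Theorem 2, defined for any epsilon > 0."""
    if epsilon <= 0.0:
        raise ValueError("epsilon must be positive")
    s = math.log(math.sqrt(2.0 / math.pi) / delta)
    if delta <= 0.0 or s < 0.0:
        raise ValueError("delta must lie in (0, sqrt(2/pi)] so that s >= 0")
    return math.sqrt(2.0) * sensitivity / (2.0 * epsilon) * (
        math.sqrt(s) + math.sqrt(s + epsilon)
    )


if __name__ == "__main__":
    # Unlike the classical bound, budgets above 1 are accepted.
    for eps in (0.1, 0.5, 1.0, 2.0, 5.0):
        print(eps, extended_gaussian_sigma(eps, 1e-5, 1.0))
```

The required noise scale decreases as $\epsilon$ grows, so budgets above 1 translate directly into less injected noise.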

Detailed proof of Theorem 2 is in Appendix A (available at https://www.dropbox.com/s/mjkq4zqqh6ifqir/HGM_Appendix.pdf?dl=0). The Extended Gaussian Mechanism enables us to relax the constraint of $\epsilon$. However, the same noise scale $\sigma$ is used to inject Gaussian noise into each component. This may not be optimal, since different components usually have different impacts on the model outcomes Bach et al. (2015). To address this, we further propose a Heterogeneous Gaussian Mechanism (HGM), in which the noise scale $\sigma$ in Theorem 2 can be arbitrarily redistributed. Different strategies can be applied to improve the model utility and to enrich the search space for better robustness bounds. For instance, more noise can be injected into less important components, or vice-versa, or the noise can even be randomly redistributed. In order to achieve our goal, we introduce a noise redistribution vector $K\mathbf{r}$, where $\mathbf{r}\in\mathbb{R}^{K}$ satisfies $0\leq r_{i}\leq 1~(i\in[K])$ and $\sum_{i=1}^{K}r_{i}=1$. We show that by injecting Gaussian noise $\mathcal{N}\big(0,\sigma^{2}K\mathbf{r}\big)$, where $\Delta_{A}=\max_{D,D^{\prime}}\sqrt{\sum_{k=1}^{K}\frac{1}{Kr_{k}}\big(A(D)_{k}-A(D^{\prime})_{k}\big)^{2}}$ and $\rho(D,D^{\prime})\leq 1$, we achieve $(\epsilon,\delta)$-DP.

Theorem 3

Heterogeneous Gaussian Mechanism. Let $A:\mathbb{R}^{d}\rightarrow\mathbb{R}^{K}$ be an arbitrary $K$ -dimensional function, and define its $l_{2}$ sensitivity to be $\Delta_{A}=\max_{D,D^{\prime}}\lVert\frac{A(D)-A(D^{\prime})}{\sqrt{K\mathbf{r}}}\rVert_{2}=\max_{D,D^{\prime}}\sqrt{\sum_{k=1}^{K}\frac{1}{Kr_{k}}\big{(}A(D)_{k}-A(D^{\prime})_{k}\big{)}^{2}}$ . A Heterogeneous Gaussian Mechanism $M$ with parameter $\sigma$ adds noise scaled to $\mathcal{N}(0,\sigma^{2}K\mathbf{r})$ to each of the $K$ components of the output. The mechanism $M$ is $(\epsilon,\delta)$ -DP, with

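$$\epsilon>0,\quad\sigma\geq\frac{\sqrt{2}\Delta_{A}}{2\epsilon}\big(\sqrt{s}+\sqrt{s+\epsilon}\big),\quad\text{and}\quad s=\ln\Big(\sqrt{\frac{2}{\pi}}\frac{1}{\delta}\Big)$$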

where $\mathbf{r}\in\mathbb{R}^{K}$ s.t. $0\leq r_{i}\leq 1~{}(i\in[K])$ and $\sum_{i=1}^{K}r_{i}=1$ .
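The two ingredients of Theorem 3, the reweighted distance inside $\Delta_{A}$ and the component-wise noise variance $\sigma^{2}Kr_{k}$, can be sketched as follows. The helper names are illustrative assumptions; the distance is evaluated for a single pair of outputs, so taking the maximum over neighboring databases is left to the caller.

```python
import math
import random


def weighted_distance(a_d, a_d_prime, r):
    """sqrt(sum_k (A(D)_k - A(D')_k)^2 / (K r_k)) for one pair of outputs.

    Requires r_k > 0 for every component that appears in the sum.
    """
    K = len(r)
    return math.sqrt(
        sum((u - v) ** 2 / (K * rk) for u, v, rk in zip(a_d, a_d_prime, r))
    )


def hgm_noise(sigma, r, rng=random):
    """Draw one noise vector whose k-th component is N(0, sigma^2 * K * r_k)."""
    if any(rk < 0.0 or rk > 1.0 for rk in r) or abs(sum(r) - 1.0) > 1e-9:
        raise ValueError("r must have entries in [0, 1] that sum to 1")
    K = len(r)
    # The standard deviation of component k is sigma * sqrt(K * r_k).
    return [rng.gauss(0.0, sigma * math.sqrt(K * rk)) for rk in r]
```

With $r_{i}=1/K$ for all $i$, the factor $Kr_{k}$ equals 1, so the distance reduces to the ordinary $l_{2}$ distance and the noise to $\mathcal{N}(0,\sigma^{2})$ in every component.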

Detailed proof of Theorem 3 is in Appendix B. It is clear that the Extended Gaussian Mechanism is a special case of the HGM, when $\forall i\in[K]:r_{i}=1/K$. Figure 1 illustrates the magnitude of noise injected by the traditional Gaussian Mechanism, the state-of-the-art Analytic Gaussian Mechanism Balle and Wang (2018), and our Heterogeneous Gaussian Mechanism as a function of $\epsilon$, given the global sensitivity $\Delta_{A}=1$, $\delta=10^{-5}$ (a very tight broken probability), and $\forall i\in[K]:r_{i}=1/K$. The lower bound of the noise scale in our HGM is slightly better than that of the traditional Gaussian Mechanism when $\epsilon\leq 1$. Moreover, our mechanism does not have the constraint $(0,1]$ on the privacy budget $\epsilon$. The Analytic Gaussian Mechanism Balle and Wang (2018), which provides the state-of-the-art noise bound, has a better noise scale than our mechanism. However, unlike the Analytic Gaussian Mechanism, our noise scale bound provides a distinctive ability to redistribute the noise via the vector $K\mathbf{r}$. There could be numerous strategies to identify the vector $\mathbf{r}$. This is significant when addressing the trade-off between model utility and privacy loss or robustness in real-world applications. In our mechanism, “more noise” is injected into “more vulnerable” components to improve the robustness. We will show how to compute the vector $\mathbf{r}$ and identify vulnerable components in our Secure-SGD algorithm. Experimental results illustrate that, by redistributing the noise, our HGM yields better robustness, compared with existing mechanisms.

4 Secure-SGD

In this section, we focus on applying our HGM in a crucial and emergent application, which is enhancing the robustness of differentially private deep neural networks. Given a deep neural network $f$ , DPSGD algorithm Abadi et al. (2016) is applied to learn $(\epsilon,\delta)$ -DP parameters $\theta$ . Then, by injecting Gaussian noise into the first hidden layer, we can leverage the robustness concept of PixelDP Lecuyer et al. (2018) (Eq. 5) to derive a better robustness bound based on our HGM.

Algorithm 1 outlines the key steps in our Secure-SGD algorithm. We first initialize the parameters $\theta$ and construct a deep neural network $f:\mathbb{R}^{d}\rightarrow\mathbb{R}^{K}$ (Lines 1-2). Then, a robustness noise $\gamma\leftarrow\mathcal{N}(0,\sigma_{r}^{2}K\mathbf{r})$ is drawn by applying our HGM (Line 3), where $\sigma_{r}$ is computed following Theorem 3, $K$ is the number of hidden neurons in $h_{1}$, denoted as $K=|h_{1}|$, and $\Delta_{f}$ is the sensitivity of the algorithm, defined as the maximum change in the output (i.e., $h_{1}(x)=W^{T}_{1}x$) that can be generated by the perturbation in the input $x$ under the noise redistribution vector $K\mathbf{r}$.

[TABLE]
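As an illustration of the noise draw $\gamma\leftarrow\mathcal{N}(0,\sigma_{r}^{2}K\mathbf{r})$, the sketch below samples each coordinate independently with standard deviation $\sigma_{r}\sqrt{Kr_{k}}$. It assumes that $\mathbf{r}$ has strictly positive entries summing to one, so that $r_{k}=1/K$ recovers isotropic noise. The helper names `hgm_std` and `sample_hgm_noise` are hypothetical and are not taken from the released implementation.

```python
import math
import random


def hgm_std(sigma, r):
    """Per-coordinate standard deviations sigma * sqrt(K * r_k) of the heterogeneous noise."""
    K = len(r)
    if any(r_k <= 0.0 for r_k in r) or abs(sum(r) - 1.0) > 1e-9:
        # Assumption of this sketch: r is a strictly positive vector that sums to one.
        raise ValueError("r must be strictly positive and sum to 1")
    return [sigma * math.sqrt(K * r_k) for r_k in r]


def sample_hgm_noise(sigma, r, rng=random):
    """Draw one noise vector gamma with independent coordinates N(0, sigma^2 * K * r_k)."""
    return [rng.gauss(0.0, std_k) for std_k in hgm_std(sigma, r)]


if __name__ == "__main__":
    rng = random.Random(0)
    r = [0.5, 0.25, 0.25]  # more noise is assigned to the first component
    print(hgm_std(1.0, r))
    print(sample_hgm_noise(1.0, r, rng))
```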

For $l_{\infty}$ -norm attacks, we use the following bound $\Delta_{f}=\sqrt{|h_{1}|}\lVert\frac{W_{1}}{K\mathbf{r}}\rVert_{\infty}$ , where $\lVert\frac{W_{1}}{K\mathbf{r}}\rVert_{\infty}$ is the maximum 1-norm of $W_{1}$ ’s rows over the vector $K\mathbf{r}$ . The vector $\mathbf{r}$ can be computed as the forward derivative of $h_{1}(x)$ as follows:

[TABLE]

where $\beta$ is a user-predefined inflation rate. Clearly, features that have higher forward derivative values will be more vulnerable to attacks that maximize the loss function $L(\theta,x)$. These features are assigned larger values in the vector $\mathbf{r}$, resulting in more noise injected, and vice versa. The computation of $\mathbf{r}$ can be considered a preprocessing step using a pre-trained model. It is important to note that the use of $\mathbf{r}$ does not risk any privacy leakage, since $\mathbf{r}$ is only applied to derive provable robustness. It does not have any effect on the DP-preserving procedure in our algorithm, which proceeds as follows. First, at each training step $t\in T$, our mechanism takes a random sample $B_{t}$ from the data $D$, with sampling probability $m/n$, where $m$ is the batch size (Line 5). For each tuple ${x}_{i}\in B_{t}$, the first hidden layer is perturbed by adding Gaussian noise derived from our HGM (Line 6, Alg. 1):

[TABLE]

This ensures that the scoring function $f(x)$ satisfies $(\epsilon_{r},\delta_{r})$-PixelDP (Lemma 3). Then, the gradient $\mathbf{g}_{t}({x}_{i})=\nabla_{\theta_{t}}{L}(\boldsymbol{\theta}_{t},{x}_{i})$ is computed (Lines 7-9). The gradients are bounded by clipping each gradient in $l_{2}$ norm; i.e., the gradient vector $\mathbf{g}_{t}(x_{i})$ is replaced by $\mathbf{g}_{t}(x_{i})/\max(1,\lVert\mathbf{g}_{t}(x_{i})\rVert_{2}/C)$ for a predefined threshold $C$ (Lines 10-12). Normally distributed noise is then added to the gradients of the parameters $\boldsymbol{\theta}$ (Line 14), as:

[TABLE]
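The clipping and noisy update can be sketched as follows. The clipping rule is the one stated above; the noise form, in which $\mathcal{N}(0,\sigma^{2}C^{2})$ is added to the sum of clipped gradients before averaging over the batch, is an assumption taken from DPSGD (Abadi et al., 2016) rather than from the equation of this paper. The function names are hypothetical.

```python
import math
import random


def clip_l2(grad, C):
    """Replace g by g / max(1, ||g||_2 / C), so the result has l2 norm at most C."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = max(1.0, norm / C)
    return [g / scale for g in grad]


def noisy_step(theta, per_example_grads, C, sigma, lr, rng):
    """One noisy descent step: clip each gradient, sum, add Gaussian noise, average, update."""
    m = len(per_example_grads)
    clipped = [clip_l2(g, C) for g in per_example_grads]
    summed = [sum(column) for column in zip(*clipped)]
    # Assumption (DPSGD-style): per-coordinate noise with standard deviation sigma * C.
    noisy_avg = [(s + rng.gauss(0.0, sigma * C)) / m for s in summed]
    return [t - lr * g for t, g in zip(theta, noisy_avg)]


if __name__ == "__main__":
    rng = random.Random(0)
    grads = [[3.0, 4.0], [0.1, -0.2]]
    print(clip_l2(grads[0], C=1.0))
    print(noisy_step([1.0, 1.0], grads, C=1.0, sigma=1.1, lr=0.1, rng=rng))
```

Clipping bounds the contribution of any single tuple to the summed gradient by $C$, which is what allows the noise scale to be calibrated to $C$.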

The parameters are then updated by gradient descent: $\boldsymbol{\theta}_{t+1}\leftarrow\boldsymbol{\theta}_{t}-\xi_{t}\widetilde{g}_{t}$, where $\xi_{t}$ is the learning rate at step $t$ (Line 16). The training process of our mechanism achieves both $(\epsilon,\delta)$-DP to protect the training data and provable robustness with the budgets $(\epsilon_{r},\delta_{r})$. In the verified testing phase (Lines 17-22), by applying HGM and PixelDP, we derive a novel robustness bound $\mu_{max}$ for a specific input $x$ as follows:

[TABLE]

where $\hat{\mathbb{E}}_{lb}$ and $\hat{\mathbb{E}}_{ub}$ are the lower and upper bounds of the expected value $\hat{\mathbb{E}}f(x)=\frac{1}{N}\sum_{n=1}^{N}f(x)_{n}$, derived from the Monte Carlo estimation with an $\eta$-confidence, where $N$ is the number of invocations of $f(x)$ with independent draws of the noise $\gamma\leftarrow\mathcal{N}(0,\sigma_{r}^{2}K\mathbf{r})$. Similar to Lecuyer et al. (2018), we use Hoeffding’s inequality Hoeffding (1963) to bound the error in $\hat{\mathbb{E}}f(x)$. If the robustness size $\mu_{max}$ is larger than a given adversarial perturbation size $\mu_{a}$, the model prediction is considered robust to attacks of that size. Given the relaxed budget $\epsilon_{r}>0$ and the noise redistribution $K\mathbf{r}$, the search space for the robustness size $\mu_{max}$ is significantly enriched (e.g., $\epsilon_{r}>1$ becomes admissible), strengthening the robustness bound. Note that the vector $\mathbf{r}$ can also be randomly drawn in the estimation of the expected value $\hat{\mathbb{E}}f(x)$. Both fully-connected and convolution layers can be applied. Given a convolution layer, we need to ensure that the computation of each feature map is $(\epsilon_{r},\delta_{r})$-PixelDP, since each of them is independently computed by reading a local region of input neurons. Therefore, the sensitivity $\Delta_{f}$ can be considered the upper-bound sensitivity given any single feature map. Our algorithm is the first effort to connect DP preservation, which protects the original training data, with provable robustness in deep learning.
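The Monte Carlo bounds $\hat{\mathbb{E}}_{lb}$ and $\hat{\mathbb{E}}_{ub}$ can be illustrated with a two-sided Hoeffding interval. The sketch below assumes that each score lies in $[0,1]$ (for example, a softmax output) and handles a single label; any correction for bounding all $K$ labels simultaneously is left out. The function name is hypothetical.

```python
import math


def hoeffding_bounds(samples, eta):
    """Lower and upper bounds on the mean of [0, 1]-valued samples with confidence eta.

    Two-sided Hoeffding: P(|mean - E| >= t) <= 2 exp(-2 N t^2); setting the right-hand
    side to 1 - eta gives t = sqrt(ln(2 / (1 - eta)) / (2 N)).
    """
    if not 0.0 < eta < 1.0:
        raise ValueError("eta must lie strictly between 0 and 1")
    n = len(samples)
    mean = sum(samples) / n
    half_width = math.sqrt(math.log(2.0 / (1.0 - eta)) / (2.0 * n))
    return max(0.0, mean - half_width), min(1.0, mean + half_width)


if __name__ == "__main__":
    scores = [0.8] * 300  # stand-in for N noisy evaluations of one label's score
    print(hoeffding_bounds(scores, eta=0.95))
```

The interval shrinks at rate $1/\sqrt{N}$, so tighter bounds on $\hat{\mathbb{E}}f(x)$ require more noisy invocations of $f(x)$ at test time.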

5 Experimental Results

We have carried out extensive experiments on two benchmark datasets, MNIST and CIFAR-10. Our goal is to evaluate whether our HGM significantly improves the robustness of both differentially private and non-private models under strong adversarial attacks, and whether our Secure-SGD approach retains better model utility compared with baseline mechanisms, under the same DP guarantees and protections.

Baseline Approaches. Our HGM and the two approaches built on it, HGM_PixelDP and Secure-SGD, are evaluated in comparison with state-of-the-art mechanisms in: (1) DP-preserving algorithms in deep learning, i.e., DPSGD Abadi et al. (2016) and AdLM Phan et al. (2017); (2) provable robustness, i.e., PixelDP Lecuyer et al. (2018); and (3) the Analytic Gaussian Mechanism (AGM) Balle and Wang (2018). To preserve DP, DPSGD injects random noise into gradients of parameters, while AdLM is a Functional Mechanism-based approach. PixelDP is one of the state-of-the-art mechanisms providing provable robustness using DP bounds. Our HGM_PixelDP model is simply PixelDP with the noise bound derived from our HGM. The baseline models share the same design in our experiment. We consider the class of $l_{\infty}$-bounded adversaries. Four white-box attack algorithms were used, including FGSM, I-FGSM, Momentum Iterative Method (MIM) Dong et al. (2017), and MadryEtAl Madry et al. (2018), to craft adversarial examples $l_{\infty}(\mu_{a})$.

MNIST: We used two convolution layers (32 and 64 features). Each hidden neuron connects with a 5x5 unit patch. A fully-connected layer has 256 units. The batch size $m$ was set to 128, $\xi=1.5$, $\psi=2$, $T_{\mu}=10$, and $\beta=1$. CIFAR-10: We used three convolution layers (128, 128, and 256 features). Each hidden neuron connects with a 3x3 unit patch in the first layer, and a 5x5 unit patch in other layers. One fully-connected layer has 256 neurons. The batch size $m$ was set to 128, $\xi=1.5$, $\psi=10$, $T_{\mu}=3$, and $\beta=1$. Note that $\boldsymbol{\epsilon}$ is used to indicate the DP budget used to protect the training data; meanwhile, $\boldsymbol{\epsilon_{r}}$ is the budget for robustness. The implementation of our mechanism is available in TensorFlow (https://github.com/haiphanNJIT/SecureSGD). We apply two accuracy metrics as follows:

[TABLE]

where $|test|$ is the number of test cases, $isCorrect(\cdot)$ returns $1$ if the model makes a correct prediction (otherwise, returns 0), and $isRobust(\cdot)$ returns $1$ if the robustness size is larger than a given attack bound $\mu_{a}$ (otherwise, returns 0).
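A minimal sketch of the two metrics, assuming that conventional accuracy averages $isCorrect(\cdot)$ over the test set and that certified accuracy averages the product $isCorrect(\cdot)\cdot isRobust(\cdot)$, as in PixelDP. The function names and the toy indicator lists are illustrative only.

```python
def conventional_accuracy(is_correct):
    """Fraction of test cases that are predicted correctly."""
    return sum(is_correct) / len(is_correct)


def certified_accuracy(is_correct, is_robust):
    """Fraction of test cases that are both correct and certified robust at the attack bound."""
    if len(is_correct) != len(is_robust):
        raise ValueError("indicator lists must have the same length")
    return sum(c * r for c, r in zip(is_correct, is_robust)) / len(is_correct)


if __name__ == "__main__":
    # Toy indicators: 1 means correct (resp. robustness size larger than mu_a), 0 otherwise.
    correct = [1, 1, 0, 1]
    robust = [1, 0, 1, 1]
    print(conventional_accuracy(correct), certified_accuracy(correct, robust))
```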

HGM_PixelDP. Figures 2 and 3 illustrate the certified accuracy under attacks of each model as a function of the adversarial perturbation $\mu_{a}$. Our HGM_PixelDP notably outperforms the PixelDP model in most of the cases given the CIFAR-10 dataset. We register an improvement of 8.63% on average when $\epsilon_{r}=8$ compared with PixelDP, i.e., $p<8.14e-7$ (two-tailed t-test). This clearly shows the effectiveness of our HGM in enhancing the robustness against adversarial examples. Regarding the MNIST data, our HGM_PixelDP model achieves better certified accuracies when $\mu_{a}\leq 0.3$ compared with the PixelDP model. On average, our HGM_PixelDP ($\epsilon_{r}=4$) improves certified accuracy by 4.17% given $\mu_{a}\leq 0.3$, compared with PixelDP, $p<5.89e-3$ (two-tailed t-test). Given very strong adversarial perturbation $\mu_{a}>0.3$, smaller $\epsilon_{r}$ usually yields better results, offering flexibility in choosing an appropriate DP budget $\epsilon_{r}$ for robustness given different attack magnitudes. These experimental results clearly show crucial benefits of relaxing the constraints of the privacy budget and of the heterogeneous noise distribution in our HGM.

Secure-SGD. The application of our HGM in DP-preserving deep neural networks, i.e., Secure-SGD, further strengthens our observations. Figures 4 and 5 illustrate the certified accuracy under attacks of each model as a function of the privacy budget $\epsilon$ used to protect the training data. By incorporating HGM into DPSGD, our Secure-SGD remarkably increases the robustness of differentially private deep neural networks. In fact, our Secure-SGD with HGM outmatches DPSGD, AdLM, and the application of AGM in our Secure-SGD algorithm in most of the cases. Note that the application of AGM in our Secure-SGD does not redistribute the noise in deriving the provable robustness. In the CIFAR-10 dataset, our Secure-SGD ($\epsilon_{r}=8$) acquires a 2.7% gain ($p<1.22e-6$, two-tailed t-test), a 3.8% gain ($p<2.16e-6$, two-tailed t-test), and a 17.75% gain ($p<2.05e-10$, two-tailed t-test) in terms of conventional accuracy, compared with AGM in Secure-SGD, DPSGD, and AdLM, respectively. We register the same phenomenon in the MNIST dataset. On average, our Secure-SGD ($\epsilon_{r}=4$) outperforms AGM in Secure-SGD and DPSGD with improvements of 2.9% ($p<8.79e-7$, two-tailed t-test) and 10.74% ($p<8.54e-14$, two-tailed t-test), respectively.

Privacy Preserving and Provable Robustness. We also discover an original, interesting, and crucial trade-off between DP preservation to protect the training data and provable robustness (Figures 4 and 5). Given our Secure-SGD model, there is a large improvement in terms of conventional accuracy when the privacy budget $\epsilon$ increases from 0.2 to 2 in the MNIST dataset (i.e., 29.67% on average), and from 2 to 10 in the CIFAR-10 dataset (i.e., 18.17% on average). This opens a long-term research avenue to achieve better provable robustness under strong privacy guarantees, since with strong privacy guarantees (i.e., small values of $\epsilon$), the conventional accuracies of all models are still modest.

6 Conclusion

In this paper, we presented a Heterogeneous Gaussian Mechanism (HGM) that relaxes the privacy budget constraint from $(0,1]$ to $(0,\infty)$, together with its heterogeneous noise bound. We designed an original application of our HGM in a DP-preserving mechanism with provable robustness, introducing a novel Secure-SGD algorithm with a better robustness bound to enhance the robustness of DP deep neural networks. Our model shows promising results and opens a long-term avenue to address the trade-off between DP preservation and provable robustness. In future work, we will learn how to identify and incorporate more practical Gaussian noise distributions to further improve the model accuracies under model attacks.

Acknowledgement

This work is partially supported by grants DTRA HDTRA1-14-1-0055, NSF CNS-1850094, NSF CNS-1747798, NSF IIS-1502273, and NJIT Seed Grant.

Appendix A Proof of Theorem 2

Proof 1

The privacy loss of the Extended Gaussian Mechanism incurred by observing an output $\mathbf{o}$ is defined as:

[TABLE]

Given $\mathbf{v}=A(D)-A(D^{\prime})$ , we have that

[TABLE]

where $\mathbf{z}=\{z_{i}=o_{i}-A(D)_{i}\}_{i\in[1,K]}$ .

Since $\mathbf{o}-A(D)\sim\mathcal{N}\big(0,\sigma^{2}{\Delta}^{2}_{A}\big)$, we have $\mathbf{z}\sim\mathcal{N}\big(0,\sigma^{2}{\Delta}^{2}_{A}\big)$. Now we will use the fact that the distribution of a spherically symmetric normal is independent of the orthogonal basis from which its constituent normals are drawn. Then, we work in a basis that is aligned with $\mathbf{v}$.

Let $\mathbf{b}_{1},\dots,\mathbf{b}_{K}$ be a basis that satisfies $\|\mathbf{b}_{i}\|=1~(i\in[1,K])$ and $\mathbf{b}_{i}\cdot\mathbf{b}_{i^{\prime}}=0~(i,i^{\prime}\in[1,K],i\neq i^{\prime})$. Fixing such a basis $\mathbf{b}_{1},\dots,\mathbf{b}_{K}$, we draw $\mathbf{z}$ by first drawing signed lengths $\lambda_{i}\sim\mathcal{N}\big(0,\sigma^{2}{\Delta}^{2}_{A}\big)~(i\in[1,K])$. Then, let $\mathbf{z}^{\prime}_{i}=\lambda_{i}\mathbf{b}_{i}$ and $\mathbf{z}=\sum_{i=1}^{K}\mathbf{z}^{\prime}_{i}$. Without loss of generality, let us assume that $\mathbf{b}_{1}$ is parallel to $\mathbf{v}$. Consider the right triangle with base $\mathbf{v}+\mathbf{z}^{\prime}_{1}$ and edge $\sum_{i=2}^{K}\mathbf{z}^{\prime}_{i}$ orthogonal to $\mathbf{v}$. The hypotenuse of this triangle is $\mathbf{z}+\mathbf{v}$ (Figure 6). Then we have

[TABLE]

Since $\mathbf{v}$ is parallel to $\mathbf{z}^{\prime}_{1}$ , we have $\|\mathbf{v}+\mathbf{z}^{\prime}_{1}\|^{2}=(\|\mathbf{v}\|+\lambda_{1})^{2}$ . Then we have

[TABLE]

By bounding the privacy loss by $\epsilon~{}(\epsilon>0)$ , we have

[TABLE]

Let $\lambda_{max}=\frac{{\Delta}_{A}}{2}(2\sigma^{2}\epsilon-1)$ . To ensure the privacy loss is bounded by $\epsilon$ with probability at least $1-\delta$ , we require

[TABLE]

Recalling that $\lambda_{1}\sim\mathcal{N}(0,\sigma^{2}{\Delta}^{2}_{A})$, we have that

[TABLE]

Then, we have

[TABLE]

Next we will use the tail bound: $\mathrm{Pr}(\lambda_{1}>t)\leq\frac{\sigma{\Delta}_{A}}{\sqrt{2\pi}\,t}e^{-\frac{t^{2}}{2\sigma^{2}{\Delta}^{2}_{A}}}$. We require:

[TABLE]

Taking $t=\lambda_{max}=\frac{{\Delta}_{A}}{2}(2\sigma^{2}\epsilon-1)$ , we have that

[TABLE]

We will ensure the above inequality by requiring: (1) $\ln\frac{2\sigma^{2}\epsilon-1}{2\sigma}\geq 0$ , and (2) $\frac{1}{2}\big{(}\frac{2\sigma^{2}\epsilon-1}{2\sigma}\big{)}^{2}\geq\ln(\sqrt{\frac{2}{\pi}}\frac{1}{\delta})$ .

[TABLE]

We can ensure this inequality (Eq. 15) by setting:

[TABLE]

Let $s=\ln(\sqrt{\frac{2}{\pi}}\frac{1}{\delta})$. If $s<0$, the second requirement will always be satisfied, and we only need to choose $\sigma$ satisfying Condition 1. When $s\geq 0$, since we already ensure $\frac{2\sigma^{2}\epsilon-1}{2\sigma}\geq 1$, we have that

[TABLE]

We can ensure the above inequality by choosing:

[TABLE]
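For readers who want the intermediate algebra, the step from requirement (2) to a closed-form bound on $\sigma$ can be spelled out as below. This is a worked sketch under the assumption that the inequality being solved is $\frac{1}{2}\big(\frac{2\sigma^{2}\epsilon-1}{2\sigma}\big)^{2}\geq s$ with $s\geq 0$ and $\sigma>0$; it is consistent with the bound $\frac{\sqrt{2}}{2\epsilon}(\sqrt{s}+\sqrt{s+\epsilon})$ quoted at the end of this proof.

```latex
% Assumption: s = \ln(\sqrt{2/\pi}/\delta) \geq 0 and \sigma > 0.
\frac{1}{2}\Big(\frac{2\sigma^{2}\epsilon-1}{2\sigma}\Big)^{2}\geq s
\;\Longleftarrow\;
\frac{2\sigma^{2}\epsilon-1}{2\sigma}\geq\sqrt{2s}
\;\Longleftrightarrow\;
2\epsilon\sigma^{2}-2\sqrt{2s}\,\sigma-1\geq 0 .

% Taking the positive root of the quadratic in \sigma:
\sigma\;\geq\;\frac{2\sqrt{2s}+\sqrt{8s+8\epsilon}}{4\epsilon}
\;=\;\frac{\sqrt{2}}{2\epsilon}\big(\sqrt{s}+\sqrt{s+\epsilon}\big).
```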

Based on the proof above, to ensure that the privacy loss $|\mathcal{L}(\mathbf{o};\mathcal{M},D,D^{\prime})|$ is bounded by $\epsilon$ with probability at least $1-\delta$, we require:

[TABLE]

To compare Condition 2 and Condition 1, we have that

[TABLE]

Since $\delta$ is usually a very small number (e.g., $1e-5\ll 0.48$), we can assume without loss of generality that Condition 2 always implies Condition 1 in practice. To ensure that the privacy loss is bounded by $\epsilon$ with probability at least $1-\delta$, only Condition 2 needs to be satisfied:

[TABLE]
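The comparison of the two conditions can be checked numerically. The sketch below assumes that Condition 1 is the solution of $\frac{2\sigma^{2}\epsilon-1}{2\sigma}\geq 1$, i.e., $\sigma\geq\frac{1+\sqrt{1+2\epsilon}}{2\epsilon}$ (derived here from the stated requirement, not quoted from the paper), and that Condition 2 is $\sigma\geq\frac{\sqrt{2}}{2\epsilon}(\sqrt{s}+\sqrt{s+\epsilon})$. Under these assumptions, Condition 2 dominates whenever $\sqrt{2s}\geq 1$, which gives the threshold $\delta\leq\sqrt{2/\pi}\,e^{-1/2}\approx 0.48$.

```python
import math


def sigma_condition_1(epsilon):
    """Smallest sigma with (2 sigma^2 epsilon - 1) / (2 sigma) >= 1 (derived, see lead-in)."""
    return (1.0 + math.sqrt(1.0 + 2.0 * epsilon)) / (2.0 * epsilon)


def sigma_condition_2(epsilon, delta):
    """Smallest sigma with (2 sigma^2 epsilon - 1) / (2 sigma) >= sqrt(2 s), assuming s >= 0."""
    s = math.log(math.sqrt(2.0 / math.pi) / delta)
    return math.sqrt(2.0) / (2.0 * epsilon) * (math.sqrt(s) + math.sqrt(s + epsilon))


# Condition 2 implies Condition 1 exactly when sqrt(2 s) >= 1, i.e. delta <= this value.
DELTA_THRESHOLD = math.sqrt(2.0 / math.pi) * math.exp(-0.5)

if __name__ == "__main__":
    print(f"delta threshold = {DELTA_THRESHOLD:.4f}")
    for eps in (0.1, 1.0, 10.0):
        print(eps, sigma_condition_1(eps), sigma_condition_2(eps, 1e-5))
```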

In this proof, the noise $\mathcal{N}(0,\sigma^{2}\Delta_{A}^{2})$ is injected into the model. If we set $\sigma\geq\frac{\sqrt{2}\Delta_{A}}{2\epsilon}(\sqrt{s}+\sqrt{s+\epsilon})$ , then the noise becomes $\mathcal{N}(0,\sigma^{2})$ . Consequently, Theorem 2 does hold.

Appendix B Proof of Theorem 3

Proof 2

The privacy loss of the Heterogeneous Gaussian Mechanism incurred by observing an output $\mathbf{o}$ is defined as:

[TABLE]

Given $\mathbf{v}=A(D)-A(D^{\prime})$ , we have that

[TABLE]

Let $\mathbf{z}$ be a $K$ -dimensional vector that satisfies $z_{k}=\frac{o_{k}-A(D)_{k}}{\sqrt{Kr_{k}}}~{}(k\in[K]).$ Let $\mathbf{v}^{\prime}$ be a $K$ -dimensional vector that satisfies $v^{\prime}_{k}=\frac{v_{k}}{\sqrt{Kr_{k}}}~{}(k\in[K])$ . Then we have that

[TABLE]

Since $\mathbf{o}-A(D)\sim\mathcal{N}\big{(}0,\sigma^{2}{\Delta}^{2}_{A}K\mathbf{r}\big{)}$ , then $\mathbf{z}\sim\mathcal{N}\big{(}0,\sigma^{2}\Delta^{2}_{A}\big{)}$ . Now we will use the fact that the distribution of a spherically symmetric normal is independent of the orthogonal basis from which its constituent normals are drawn. Then, we work in a basis that is aligned with $\mathbf{v}^{\prime}$ .
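The change of variables $z_{k}=\frac{o_{k}-A(D)_{k}}{\sqrt{Kr_{k}}}$ is what reduces the heterogeneous case to the spherically symmetric one. The following sketch checks this empirically: it draws coordinates with standard deviation $\sigma\Delta_{A}\sqrt{Kr_{k}}$, rescales them, and estimates the variance of each rescaled coordinate. The function names, the seed, and the example vector $\mathbf{r}$ are illustrative assumptions.

```python
import math
import random


def whiten(noise, r):
    """Rescale coordinate k by 1 / sqrt(K * r_k), mirroring the definition of z_k."""
    K = len(r)
    return [n_k / math.sqrt(K * r_k) for n_k, r_k in zip(noise, r)]


def empirical_variances(r, scale, trials, seed=0):
    """Estimate the variance of each whitened coordinate; scale plays the role of sigma * Delta_A."""
    rng = random.Random(seed)
    K = len(r)
    sums = [0.0] * K
    for _ in range(trials):
        noise = [rng.gauss(0.0, scale * math.sqrt(K * r_k)) for r_k in r]
        for k, z_k in enumerate(whiten(noise, r)):
            sums[k] += z_k * z_k
    return [total / trials for total in sums]


if __name__ == "__main__":
    # Every whitened coordinate should have variance close to scale^2 = 4.
    print(empirical_variances([0.6, 0.3, 0.1], scale=2.0, trials=20000))
```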

Let $\mathbf{b}_{1},\dots,\mathbf{b}_{K}$ be a basis that satisfies $\|\mathbf{b}_{k}\|=1~(k\in[K])$ and $\mathbf{b}_{k}\cdot\mathbf{b}_{k^{\prime}}=0~(k,k^{\prime}\in[K],k\neq k^{\prime})$. Fixing such a basis $\mathbf{b}_{1},\dots,\mathbf{b}_{K}$, we draw $\mathbf{z}$ by first drawing signed lengths $\lambda_{k}\sim\mathcal{N}\big(0,\sigma^{2}{\Delta}^{2}_{A}\big)~(k\in[K])$. Then, let $\mathbf{z}^{\prime}_{k}=\lambda_{k}\mathbf{b}_{k}$, and finally let $\mathbf{z}=\sum_{k=1}^{K}\mathbf{z}^{\prime}_{k}$. Assume without loss of generality that $\mathbf{b}_{1}$ is parallel to $\mathbf{v}^{\prime}$. Consider the right triangle with base $\mathbf{v}^{\prime}+\mathbf{z}^{\prime}_{1}$ and edge $\sum_{k=2}^{K}\mathbf{z}^{\prime}_{k}$ orthogonal to $\mathbf{v}^{\prime}$. The hypotenuse of this triangle is $\mathbf{z}+\mathbf{v}^{\prime}$ (Figure 6). Then we have

[TABLE]

Since $\mathbf{v}^{\prime}$ is parallel to $\mathbf{z}^{\prime}_{1}$, we have $\|\mathbf{z}^{\prime}_{1}+\mathbf{v}^{\prime}\|^{2}=(\|\mathbf{v}^{\prime}\|+\lambda_{1})^{2}$. Then we have

[TABLE]

By bounding the privacy loss by $\epsilon~{}(\epsilon>0)$ , we have

[TABLE]

Let $\lambda_{max}=\frac{{\Delta}_{A}}{2}(2\sigma^{2}\epsilon-1)$ . To ensure the privacy loss is bounded by $\epsilon$ with probability at least $1-\delta$ , we require

[TABLE]

Recalling that $\lambda_{1}\sim\mathcal{N}(0,\sigma^{2}{\Delta}^{2}_{A})$, we have that

[TABLE]

Then, we have

[TABLE]

Next we will use the tail bound: $\mathrm{Pr}(\lambda_{1}>t)\leq\frac{\sigma{\Delta}_{A}}{\sqrt{2\pi}\,t}e^{-\frac{t^{2}}{2\sigma^{2}{\Delta}^{2}_{A}}}$. We require:

[TABLE]

Taking $t=\lambda_{max}=\frac{{\Delta}_{A}}{2}(2\sigma^{2}\epsilon-1)$ , we have that

[TABLE]

We will ensure the above inequality by requiring: (1) $\ln\frac{2\sigma^{2}\epsilon-1}{2\sigma}\geq 0$ , and (2) $\frac{1}{2}\big{(}\frac{2\sigma^{2}\epsilon-1}{2\sigma}\big{)}^{2}\geq\ln(\sqrt{\frac{2}{\pi}}\frac{1}{\delta})$ .

[TABLE]

We can ensure this inequality (Eq. 26) by setting:

[TABLE]

Let $s=\ln(\sqrt{\frac{2}{\pi}}\frac{1}{\delta})$. If $s<0$, the second requirement will always be satisfied, and we only need to choose $\sigma$ satisfying Condition 1. When $s\geq 0$, since we already ensure $\frac{2\sigma^{2}\epsilon-1}{2\sigma}\geq 1$, we have that

[TABLE]

We can ensure the above inequality by choosing:

[TABLE]

Based on the proof above, to ensure that the privacy loss $|\mathcal{L}(\mathbf{o};\mathcal{M},D,D^{\prime})|$ is bounded by $\epsilon$ with probability at least $1-\delta$, we require:

[TABLE]

To compare Condition 2 and Condition 1, we have that

[TABLE]

Since $\delta$ is usually a very small number (e.g., $1e-5\ll 0.48$), we can assume without loss of generality that Condition 2 always implies Condition 1 in practice. To ensure that the privacy loss is bounded by $\epsilon$ with probability at least $1-\delta$, only Condition 2 needs to be satisfied:

[TABLE]

In this proof, the noise $\mathcal{N}(0,\sigma^{2}\Delta_{A}^{2}K\mathbf{r})$ is injected into the model. If we set $\sigma\geq\frac{\sqrt{2}\Delta_{A}}{2\epsilon}(\sqrt{s}+\sqrt{s+\epsilon})$ , then the noise becomes $\mathcal{N}(0,\sigma^{2}K\mathbf{r})$ . Consequently, Theorem 3 does hold.

Bibliography


Abadi et al. [2016] Martin Abadi, Andy Chu, Ian Goodfellow, H. Brendan McMahan, Ilya Mironov, Kunal Talwar, and Li Zhang. Deep learning with differential privacy. arXiv:1607.00133, 2016.
Bach et al. [2015] Sebastian Bach, Alexander Binder, Grégoire Montavon, Frederick Klauschen, Klaus-Robert Müller, and Wojciech Samek. On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation. PLoS ONE, 10(7):e0130140, 07 2015.
Balle and Wang [2018] Borja Balle and Yu-Xiang Wang. Improving the Gaussian mechanism for differential privacy: Analytical calibration and optimal denoising. In Jennifer Dy and Andreas Krause, editors, Proceedings of the 35th International Conference on Machine Learning, volume 80 of Proceedings of Machine Learning Research, pages 394–403, Stockholmsmässan, Stockholm Sweden, 10–15 Jul 2018. PMLR.
Carlini and Wagner [2017] N. Carlini and D. Wagner. Towards evaluating the robustness of neural networks. In 2017 IEEE Symposium on Security and Privacy (SP), pages 39–57, May 2017.
Chatzikokolakis et al. [2013] Konstantinos Chatzikokolakis, Miguel E. Andrés, Nicolás Emilio Bordenabe, and Catuscia Palamidessi. Broadening the scope of differential privacy using metrics. In Emiliano De Cristofaro and Matthew Wright, editors, Privacy Enhancing Technologies, pages 82–102, 2013.
Dong et al. [2017] Yinpeng Dong, Fangzhou Liao, Tianyu Pang, Xiaolin Hu, and Jun Zhu. Discovering adversarial examples with momentum. CoRR, abs/1710.06081, 2017.
Dwork and Roth [2014] Cynthia Dwork and Aaron Roth. The algorithmic foundations of differential privacy. Foundations and Trends® in Theoretical Computer Science, 9(3–4):211–407, 2014.
Dwork et al. [2006] C. Dwork, F. McSherry, K. Nissim, and A. Smith. Calibrating noise to sensitivity in private data analysis. Theory of Cryptography, pages 265–284, 2006.