A Fundamental Performance Limitation for Adversarial Classification

Abed AlRahman Al Makdah; Vaibhav Katewa; and Fabio Pasqualetti

arXiv:1903.01032·cs.LG·March 18, 2019

TL;DR

This paper proves a fundamental tradeoff in adversarial classification, showing that optimizing accuracy inherently increases sensitivity to data manipulation, and this tradeoff is dictated solely by data statistics, not algorithm tuning.

Contribution

It establishes a formal, fundamental limit on the accuracy-sensitivity tradeoff in adversarial classification, independent of specific algorithm choices.

Findings

01

Accuracy-sensitivity tradeoff is unavoidable in adversarial settings.

02

The tradeoff depends only on data statistics, not on algorithm complexity.

03

Tuning algorithms cannot surpass this fundamental limit.

Abstract

Despite the widespread use of machine learning algorithms to solve problems of technological, economic, and social relevance, provable guarantees on the performance of these data-driven algorithms are critically lacking, especially when the data originates from unreliable sources and is transmitted over unprotected and easily accessible channels. In this paper we take an important step to bridge this gap and formally show that, in a quest to optimize their accuracy, binary classification algorithms -- including those based on machine-learning techniques -- inevitably become more sensitive to adversarial manipulation of the data. Further, for a given class of algorithms with the same complexity (i.e., number of classification boundaries), the fundamental tradeoff curve between accuracy and sensitivity depends solely on the statistics of the data, and cannot be improved by tuning the…

Tables

Table I: Numerical results

Classifier	$y_{1}$	$y_{2}$	$\mathcal{S}(y;\theta)$	$\mathcal{A}(y;\theta)$	$\mathcal{A}_{s1}(x)$	$\mathcal{A}_{s2}(x)$
$\mathfrak{C}^{1}$	3.65	18.78	0.0334	0.7891	0.6857	0.6808
$\mathfrak{C}^{2}$	1.83	20.60	0.0201	0.7766	0.6947	0.6939

Equations

\mathcal{H}_{0}: x\sim f_{0}(x;\theta_{0}),\quad\text{and}\quad\mathcal{H}_{1}: x\sim f_{1}(x;\theta_{1}),

\mathfrak{C}(x;y)=\begin{cases}\mathcal{H}_{0}, & x\in\mathcal{R}_{0},\\ \mathcal{H}_{1}, & x\in\mathcal{R}_{1},\end{cases}

\mathcal{R}_{0}

\mathcal{R}_{1}

\mathcal{A}(y;\theta)=p_{0}\,\mathbb{P}[x\in\mathcal{R}_{0}\mid\mathcal{H}_{0}]+p_{1}\,\mathbb{P}[x\in\mathcal{R}_{1}\mid\mathcal{H}_{1}],

\begin{split}\mathcal{A}(y;\theta)&=p_{0}\Bigg(\sum_{l=1}^{n}(-1)^{l+1}\int_{-\infty}^{y_{l}}f_{0}(x;\theta_{0})\,dx+1\Bigg)\\ &+p_{1}\Bigg(\sum_{l=1}^{n}(-1)^{l}\int_{-\infty}^{y_{l}}f_{1}(x;\theta_{1})\,dx\Bigg).\end{split}

L(x)=\frac{p_{1}f_{1}(x;\theta_{1})}{p_{0}f_{0}(x;\theta_{0})}.

\mathfrak{C}_{\text{ML}}(x;\eta)=\begin{cases}\mathcal{H}_{0}, & L(x)<\eta,\\ \mathcal{H}_{1}, & L(x)\geq\eta,\end{cases}

p_{1}f_{1}(x;\theta_{1})-\eta\,p_{0}f_{0}(x;\theta_{0})=0.

\mathfrak{C}_{\text{L}}(x;y)=\begin{cases}\mathcal{H}_{0}, & x<y,\\ \mathcal{H}_{1}, & x\geq y.\end{cases}

\mathcal{A}(y;\theta)

y_{\text{L}}^{*}=\arg\max_{y_{i}}\;\mathcal{A}(y_{i};\theta)\quad\text{s.t. } y_{i}\text{ is a solution of (6) with }\eta=1.

a x^{2}+b x+c=0\quad\text{where,}

a=\frac{1}{2}\Bigg(\frac{1}{\sigma_{0}^{2}}-\frac{1}{\sigma_{1}^{2}}\Bigg),\quad b=\Bigg(\frac{\mu_{1}}{\sigma_{1}^{2}}-\frac{\mu_{0}}{\sigma_{0}^{2}}\Bigg),\quad\text{and}

c=\log\bigg(\frac{\sigma_{0}}{\sigma_{1}}\bigg)+\log\bigg(\frac{p_{1}}{p_{0}}\bigg)+\frac{\mu_{0}^{2}}{2\sigma_{0}^{2}}-\frac{\mu_{1}^{2}}{2\sigma_{1}^{2}}-\log(\eta).

\mathcal{S}(y;\theta)=\left\|\frac{\partial\mathcal{A}(y;\theta)}{\partial\theta}\right\|_{\infty},

\begin{split}\mathcal{A}(y;\theta)&=p_{0}\Big(Q\Big(\frac{y_{1}-\mu_{0}}{\sigma_{0}}\Big)-Q\Big(\frac{y_{2}-\mu_{0}}{\sigma_{0}}\Big)+1\Big)\\ &+p_{1}\Big(-Q\Big(\frac{y_{1}-\mu_{1}}{\sigma_{1}}\Big)+Q\Big(\frac{y_{2}-\mu_{1}}{\sigma_{1}}\Big)\Big)\quad\text{and,}\end{split}

\mathcal{S}(y;\theta)=\left\|\begin{bmatrix}p_{0}\big(f_{0}(y_{2};\theta_{0})-f_{0}(y_{1};\theta_{0})\big)\\ p_{0}\Big(\frac{\mu_{0}-y_{1}}{\sigma_{0}}f_{0}(y_{1};\theta_{0})-\frac{\mu_{0}-y_{2}}{\sigma_{0}}f_{0}(y_{2};\theta_{0})\Big)\\ p_{1}\big(f_{1}(y_{1};\theta_{1})-f_{1}(y_{2};\theta_{1})\big)\\ p_{1}\Big(\frac{\mu_{1}-y_{2}}{\sigma_{1}}f_{1}(y_{2};\theta_{1})-\frac{\mu_{1}-y_{1}}{\sigma_{1}}f_{1}(y_{1};\theta_{1})\Big)\end{bmatrix}\right\|_{\infty},

\Bigg(p_{0}\frac{\partial}{\partial y_{i}}f_{0}(y_{i};\theta_{0})\Bigg|_{y_{i}^{*}}-p_{1}\frac{\partial}{\partial y_{i}}f_{1}(y_{i};\theta_{1})\Bigg|_{y_{i}^{*}}\Bigg)\frac{\partial y_{i}^{*}}{\partial\theta^{(j)}}\neq 0.

\left.\frac{\partial\mathcal{S}(y;\theta)}{\partial y}\right|_{y^{*}}\neq 0.

\frac{\mathsf{d}g(y^{*};\theta)}{\mathsf{d}\theta^{(j)}}=\frac{\partial g(y;\theta)}{\partial\theta^{(j)}}\Bigg|_{y^{*}}+\frac{\partial g(y;\theta)}{\partial y}\Bigg|_{y^{*}}\frac{\partial y^{*}}{\partial\theta^{(j)}}=0,

\Rightarrow\frac{\partial}{\partial y}\frac{\partial\mathcal{A}(y;\theta)}{\partial\theta^{(j)}}\Bigg|_{y^{*}}=-\frac{\partial^{2}\mathcal{A}(y;\theta)}{\partial y^{2}}\Bigg|_{y^{*}}\frac{\partial y^{*}}{\partial\theta^{(j)}},

w_{i}(y_{i})=p_{0}(-1)^{i+1}\frac{\partial}{\partial y_{i}}f_{0}(y_{i};\theta_{0})+p_{1}(-1)^{i}\frac{\partial}{\partial y_{i}}f_{1}(y_{i};\theta_{1}).

\mathcal{S}(y^{*}+\delta;\theta)<\mathcal{S}(y^{*};\theta)\quad\text{and}\quad\mathcal{A}(y^{*}+\delta;\theta)\leq\mathcal{A}(y^{*};\theta).

\left.\frac{\partial\mathcal{S}(y;\theta)}{\partial y}\right|_{y_{\text{L}}^{*}}\neq 0.

\left.\frac{\partial\mathcal{S}(y(\eta,\theta);\theta)}{\partial\eta}\right|_{\eta=1}\neq 0.

\left.\frac{\partial\mathcal{S}(y(\eta,\theta);\theta)}{\partial\eta}\right|_{\eta=1}

\min_{y}\;\mathcal{S}(y;\theta)\quad\text{s.t.}\quad\mathcal{A}(y;\theta)=\zeta,

\mathcal{S}

f_{0}(x;\theta_{0})=\mathcal{N}(x;\mu_{0},\sigma_{0}),\quad f_{1}(x;\theta_{1})=\mathcal{N}(x;\mu_{1},\sigma_{1}).

f_{0}(x;\theta_{0})=\mathcal{N}(x;\mu_{0}+\bar{\mu}_{0},\sigma_{0}+\bar{\sigma}_{0}),\quad\text{and}\quad f_{1}(x;\theta_{1})=\mathcal{N}(x;\mu_{1}+\bar{\mu}_{1},\sigma_{1}+\bar{\sigma}_{1}),

\begin{aligned}&\underset{\theta\in\Theta}{\text{min}}&&\mathcal{S}\big(y^{*}(\theta),\theta\big)\\ &\text{s.t.}&&\mathcal{A}\big(y^{*}(\theta),\theta\big)=\gamma,\end{aligned}

\mathcal{S}(y^{*}(\theta);\theta)

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Machine Learning and Data Classification

Full text

A Fundamental Performance Limitation for Adversarial Classification

Abed AlRahman Al Makdah, Vaibhav Katewa, and Fabio Pasqualetti

This work was supported in part by ARO award 71603NSYIP. The authors are with the Departments of Electrical and Computer Engineering and Mechanical Engineering at the University of California, Riverside, {abedam, vkatewa, fabiopas}@engr.ucr.edu.

Abstract

Despite the widespread use of machine learning algorithms to solve problems of technological, economic, and social relevance, provable guarantees on the performance of these data-driven algorithms are critically lacking, especially when the data originates from unreliable sources and is transmitted over unprotected and easily accessible channels. In this paper we take an important step to bridge this gap and formally show that, in a quest to optimize their accuracy, binary classification algorithms – including those based on machine-learning techniques – inevitably become more sensitive to adversarial manipulation of the data. Further, for a given class of algorithms with the same complexity (i.e., number of classification boundaries), the fundamental tradeoff curve between accuracy and sensitivity depends solely on the statistics of the data, and cannot be improved by tuning the algorithm.

I Introduction

Artificial intelligence and machine learning algorithms, including neural networks, are used widely in technological, social and economic applications, such as computer vision [1, 2], speech recognition [3, 4], and malware detection [5]. While these algorithms typically achieve high performance under nominal and well-modeled conditions, they are also very sensitive to small, targeted, and possibly malicious manipulations of the training and execution data [6]. The reasons for this unreliable behavior are still largely unknown, thus motivating the critical need for novel theories and tools to deploy robust, reliable, and safe data-driven algorithms.

In this paper we formally reveal a fundamental and previously unknown tradeoff between the accuracy of a binary classification algorithm and its sensitivity to arbitrary manipulation of the data. In particular, we cast a binary classification problem into a hypothesis testing framework, parametrize classification algorithms – including those based on machine learning techniques – using their decision boundaries, and show that the accuracy of the algorithm can be maximized only at the expense of its sensitivity. This tradeoff, which applies to general classification algorithms, depends on the statistics of the data, and cannot be improved by simply tuning the algorithm. Our theory explains quantitatively how simple algorithms can outperform more complex implementations when operating in adversarial environments.

Related work: Recent work has shown that classification based on neural networks is vulnerable to adversarial perturbations [6, 7], and that these perturbations are universal and affect a large number of classification algorithms. While heuristic explanations of this phenomenon and heuristic techniques have been proposed, including adversarial learning [7, 8, 9, 10, 11, 12], black-box methods [8], and gradient-based methods [7, 9], a fundamental analytical understanding of the limitations of classification algorithms under adversarial perturbations is critically lacking. We identify these limitations for a binary classification problem in a Bayesian setting. Although the setting is simple, our analysis formally shows that a fundamental tradeoff exists between the accuracy and the sensitivity of any classification algorithm, independently of the complexity of the algorithm. Also related to this study are the papers [10, 13, 14], which derive methods to measure the robustness of different classifiers against adversarial perturbations and obtain guarantees against bounded perturbations, and [11], which shows how adversarial training improves the classifier’s performance against adversarial perturbations while deteriorating its performance under nominal conditions. Our approach provides rigorous mathematical support to the empirical evidence obtained in these works.

Contribution: This paper features three main contributions. First, we propose metrics to quantify the accuracy of a classification algorithm and its sensitivity to arbitrary manipulation of the data. We prove that, under a set of mild technical assumptions, the accuracy of a classification algorithm can only be maximized at the expense of its sensitivity. Thus, a fundamental tradeoff exists between the performance of a classification algorithm in nominal and adversarial settings. While our results formally apply to binary classification problems, we conjecture that this fundamental tradeoff in fact applies to more general classification problems. Second, we show that a tradeoff between accuracy and sensitivity exists for different classes of classification algorithms, and that simpler algorithms can sometimes outperform more complex ones in adversarial settings. Third, for a fixed complexity of the classification algorithm, we numerically show that the accuracy versus robustness tradeoff depends solely on the statistics of the data, and cannot be arbitrarily improved by tuning the classification algorithm, including using sophisticated adversarial learning techniques. Taken together, our results suggest that performance and robustness of data-driven algorithms are dictated by the properties of the data, and not by the sophistication or intelligence of the algorithm.

II Problem setup and preliminary notions

To reveal a fundamental tradeoff between the accuracy of a classification algorithm and its robustness against malicious data manipulation, we consider a binary classification problem where the objective is to decide whether a scalar observation $x\in\mathbb{R}$ belongs to one of the classes $\mathcal{H}_{0}$ and $\mathcal{H}_{1}$. We assume that the distribution of the observations satisfies

$$\mathcal{H}_{0}: x\sim f_{0}(x;\theta_{0}),\quad\text{and}\quad\mathcal{H}_{1}: x\sim f_{1}(x;\theta_{1}), \tag{1}$$

where $f_{0}(x;\theta_{0})$ and $f_{1}(x;\theta_{1})$ are arbitrary, yet known, probability density functions with parameters $\theta_{0}\in\mathbb{R}^{m_{0}}$ and $\theta_{1}\in\mathbb{R}^{m_{1}}$ , respectively. We assume that the partial derivatives of $f_{k}$ with respect to $x$ and $\theta_{k}$ exist and are continuous over the domain of the distributions, for $k=0,1$ . Let $p_{0}$ and $p_{1}$ denote the prior probabilities of the observations belonging to the classes $\mathcal{H}_{0}$ and $\mathcal{H}_{1}$ , respectively. Different (machine learning) algorithms can be used to solve the above binary classification problem. Yet, because of the binary nature of the problem, any classification algorithm can be represented by a suitable partition of the real line, and it can be written as

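$$\mathfrak{C}(x;y)=\begin{cases}\mathcal{H}_{0}, & x\in\mathcal{R}_{0},\\ \mathcal{H}_{1}, & x\in\mathcal{R}_{1},\end{cases} \tag{2}$$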

where$^{1}$ $y=[y_{i}]$ denotes a set of boundary points, with $y_{0}\leq\dots\leq y_{n+1}$, $y_{0}=-\infty$, $y_{n+1}=\infty$, and

$$\mathcal{R}_{0}=\bigcup_{k=0}^{n/2}(y_{2k},y_{2k+1}),\qquad\mathcal{R}_{1}=\bigcup_{k=1}^{n/2}[y_{2k-1},y_{2k}].$$

Footnote 1: For simplicity and without affecting generality, we assume that $n$ is even. Further, an alternative configuration of the classifier (2) assigns $\mathcal{H}_{0}$ and $\mathcal{H}_{1}$ to $\mathcal{R}_{1}$ and $\mathcal{R}_{0}$, respectively. However, because accuracy and sensitivity of the two configurations can be obtained from each other, we consider only the configuration in (2) without affecting the generality of our analysis.

We refer to (2) as the general classifier. We measure the performance of a classification algorithm through its accuracy, that is, its probability of making a correct classification.

Definition 1

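(Accuracy of a classifier) The accuracy of the classification algorithm $\mathfrak{C}(x;y)$ is

$$\mathcal{A}(y;\theta)=p_{0}\,\mathbb{P}[x\in\mathcal{R}_{0}\mid\mathcal{H}_{0}]+p_{1}\,\mathbb{P}[x\in\mathcal{R}_{1}\mid\mathcal{H}_{1}], \tag{3}$$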

where $\theta=[\theta_{0}^{\mathsf{T}}\;\theta_{1}^{\mathsf{T}}]^{\mathsf{T}}$ contains the distribution parameters. $\square$

Using Equation (3) and the distributions in (1), we obtain

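$$\begin{split}\mathcal{A}(y;\theta)&=p_{0}\Bigg(\sum_{l=1}^{n}(-1)^{l+1}\int_{-\infty}^{y_{l}}f_{0}(x;\theta_{0})\,dx+1\Bigg)\\ &+p_{1}\Bigg(\sum_{l=1}^{n}(-1)^{l}\int_{-\infty}^{y_{l}}f_{1}(x;\theta_{1})\,dx\Bigg).\end{split} \tag{4}$$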

Clearly, the accuracy of a classification algorithm depends on the position of its boundaries, which can be selected to maximize the accuracy of the classification algorithm. To this aim, let $L(x)$ denote the Likelihood Ratio defined as

$$L(x)=\frac{p_{1}f_{1}(x;\theta_{1})}{p_{0}f_{0}(x;\theta_{0})}.$$

The Maximum Likelihood (ML) classifier is

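$$\mathfrak{C}_{\text{ML}}(x;\eta)=\begin{cases}\mathcal{H}_{0}, & L(x)<\eta,\\ \mathcal{H}_{1}, & L(x)\geq\eta,\end{cases} \tag{5}$$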

where the threshold $\eta>0$ is a design parameter that determines the boundary points and, thus, the accuracy of the classifier. As a known result in statistical hypothesis testing [15], the accuracy of the ML classifier with $\eta=1$ is the largest among all possible classifiers. The value and the number of boundary points of the ML classifier depend on the distributions $f_{0}(x;\theta_{0})$ and $f_{1}(x;\theta_{1})$ , the threshold $\eta$ , and the prior probabilities through the equation

$$p_{1}f_{1}(x;\theta_{1})-\eta\,p_{0}f_{0}(x;\theta_{0})=0. \tag{6}$$

Another important class of classifiers is the class of linear classifiers, which are less complex and often achieve a competitive performance compared to nonlinear classifiers (see [16] for more details). In our setting, a linear classifier consists of one decision boundary $y\in\mathbb{R}$ , and is given by

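$$\mathfrak{C}_{\text{L}}(x;y)=\begin{cases}\mathcal{H}_{0}, & x<y,\\ \mathcal{H}_{1}, & x\geq y.\end{cases} \tag{7}$$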

Following Definition 1, the accuracy of $\mathfrak{C}_{\text{L}}$ is

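$$\mathcal{A}(y;\theta)=p_{0}\int_{-\infty}^{y}f_{0}(x;\theta_{0})\,dx+p_{1}\int_{y}^{\infty}f_{1}(x;\theta_{1})\,dx. \tag{8}$$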

The optimal boundary $y_{\text{L}}^{*}$ that maximizes $\mathcal{A}(y;\theta)$ is

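$$y_{\text{L}}^{*}=\arg\max_{y_{i}}\;\mathcal{A}(y_{i};\theta)\quad\text{s.t. } y_{i}\text{ is a solution of (6) with }\eta=1. \tag{9}$$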

While the boundaries are difficult to compute for general distributions, they can be computed explicitly when the observations are Gaussian (see below). Let $\mathcal{N}(x;\mu,\sigma)=\frac{1}{\sqrt{2\pi\sigma^{2}}}e^{-\frac{(x-\mu)^{2}}{2\sigma^{2}}}$ be the p.d.f. of a normal random variable with mean $\mu$ and standard deviation $\sigma$, and $Q(z)=\int_{-\infty}^{z}\frac{1}{\sqrt{2\pi}}e^{\frac{-x^{2}}{2}}dx$ the c.d.f. of the standard normal distribution.

Remark 1

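(ML and linear classifiers for Gaussian distributions) For the Gaussian distributions $f_{i}(x;\theta_{i})=\mathcal{N}(x;\mu_{i},\sigma_{i})$, $i=0,1$, the boundaries of the ML classifier satisfy

$$a x^{2}+b x+c=0, \tag{10}$$

where

$$a=\frac{1}{2}\Bigg(\frac{1}{\sigma_{0}^{2}}-\frac{1}{\sigma_{1}^{2}}\Bigg),\quad b=\Bigg(\frac{\mu_{1}}{\sigma_{1}^{2}}-\frac{\mu_{0}}{\sigma_{0}^{2}}\Bigg),\quad\text{and}$$

$$c=\log\bigg(\frac{\sigma_{0}}{\sigma_{1}}\bigg)+\log\bigg(\frac{p_{1}}{p_{0}}\bigg)+\frac{\mu_{0}^{2}}{2\sigma_{0}^{2}}-\frac{\mu_{1}^{2}}{2\sigma_{1}^{2}}-\log(\eta).$$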

Equation (10) has at most two real solutions, implying that the ML classifier has at most two decision boundaries (see Fig. 1). The ML classifier with boundaries corresponding to the solutions of (10) with $\eta=1$ has maximum accuracy [15]. The solution of (10) which maximizes the accuracy in (8) is the boundary for the optimal linear classifier. $\square$
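The computation described in Remark 1 can be sketched in a few lines of Python. This is a minimal illustration and not the authors' code: the function names and the numerical parameters are hypothetical, and the coefficients are the ones that follow from taking the logarithm of $p_{1}f_{1}=\eta p_{0}f_{0}$ for two Gaussians. The parameters are chosen with $\sigma_{1}<\sigma_{0}$, so that the region between the two boundaries is assigned to $\mathcal{H}_{1}$, as in the configuration (2).

```python
import math

def normal_pdf(x, mu, sigma):
    """Gaussian density N(x; mu, sigma), with sigma the standard deviation."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / math.sqrt(2 * math.pi * sigma ** 2)

def normal_cdf(x, mu, sigma):
    """Gaussian c.d.f., written with the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def ml_boundaries(mu0, sigma0, mu1, sigma1, p0, p1, eta=1.0):
    """Real solutions of a*x^2 + b*x + c = 0, sorted in increasing order."""
    a = 0.5 * (1.0 / sigma0 ** 2 - 1.0 / sigma1 ** 2)
    b = mu1 / sigma1 ** 2 - mu0 / sigma0 ** 2
    c = (math.log(sigma0 / sigma1) + math.log(p1 / p0)
         + mu0 ** 2 / (2 * sigma0 ** 2) - mu1 ** 2 / (2 * sigma1 ** 2) - math.log(eta))
    if a == 0.0:
        # Equal variances: the equation is linear, with at most one boundary.
        return [] if b == 0.0 else [-c / b]
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return []  # no real boundary: the classifier always returns the same class
    root = math.sqrt(disc)
    return sorted([(-b - root) / (2 * a), (-b + root) / (2 * a)])

def linear_accuracy(y, mu0, sigma0, mu1, sigma1, p0, p1):
    """Accuracy of the single-boundary classifier: H0 if x < y, H1 otherwise."""
    return p0 * normal_cdf(y, mu0, sigma0) + p1 * (1.0 - normal_cdf(y, mu1, sigma1))

# Hypothetical parameters, chosen only for illustration.
params = dict(mu0=0.0, sigma0=2.0, mu1=1.0, sigma1=1.0, p0=0.5, p1=0.5)
boundaries = ml_boundaries(**params)
# Optimal linear boundary: the root with the largest single-boundary accuracy.
y_linear = max(boundaries, key=lambda y: linear_accuracy(y, **params))
print(boundaries, y_linear)
```

The equal-variance branch is included because the quadratic degenerates to a linear equation when $\sigma_{0}=\sigma_{1}$, in which case the ML classifier has a single boundary.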

To characterize the robustness of a classifier to adversarial manipulation of the observations, we define the following sensitivity metric, which captures the degradation of the classification accuracy following data manipulation. Note that, by manipulating the observations, the adversary effectively changes the parameters of the distributions in (1).

Definition 2

(Sensitivity of a classifier) The sensitivity of the classification algorithm$^{2}$ $\mathfrak{C}(x;y)$ is

$$\mathcal{S}(y;\theta)=\left\|\frac{\partial\mathcal{A}(y;\theta)}{\partial\theta}\right\|_{\infty}, \tag{11}$$

where $\theta$ contains the parameters of the distributions in (1), and $\mathcal{A}(y;\theta)$ denotes the accuracy of $\mathfrak{C}(x;y)$. $\square$

Footnote 2: Definition 2 is also valid for the ML and the linear classifier.

From Definition 2, a higher value of sensitivity implies that the adversary can affect the classifier’s performance to a larger extent, whereas a lower sensitivity implies that the classifier is more robust to adversarial manipulation. Further, the $\infty$-norm captures the worst case scenario in terms of the largest sensitivity with respect to the components of $\theta$.

Remark 2

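(Accuracy and sensitivity of the ML classifier for Gaussian distributions) The accuracy and the sensitivity of the ML classifier are obtained by substituting the expression of the normal distributions $\mathcal{N}(x;\mu_{i},\sigma_{i})$ in (3) and (11):

$$\begin{split}\mathcal{A}(y;\theta)&=p_{0}\Big(Q\Big(\frac{y_{1}-\mu_{0}}{\sigma_{0}}\Big)-Q\Big(\frac{y_{2}-\mu_{0}}{\sigma_{0}}\Big)+1\Big)\\ &+p_{1}\Big(-Q\Big(\frac{y_{1}-\mu_{1}}{\sigma_{1}}\Big)+Q\Big(\frac{y_{2}-\mu_{1}}{\sigma_{1}}\Big)\Big)\quad\text{and,}\end{split}$$

$$\mathcal{S}(y;\theta)=\left\|\begin{bmatrix}p_{0}\big(f_{0}(y_{2};\theta_{0})-f_{0}(y_{1};\theta_{0})\big)\\ p_{0}\Big(\frac{\mu_{0}-y_{1}}{\sigma_{0}}f_{0}(y_{1};\theta_{0})-\frac{\mu_{0}-y_{2}}{\sigma_{0}}f_{0}(y_{2};\theta_{0})\Big)\\ p_{1}\big(f_{1}(y_{1};\theta_{1})-f_{1}(y_{2};\theta_{1})\big)\\ p_{1}\Big(\frac{\mu_{1}-y_{2}}{\sigma_{1}}f_{1}(y_{2};\theta_{1})-\frac{\mu_{1}-y_{1}}{\sigma_{1}}f_{1}(y_{1};\theta_{1})\Big)\end{bmatrix}\right\|_{\infty},$$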

where $\theta_{i}=[\mu_{i}\;\sigma_{i}]^{\mathsf{T}}$ and $i=0,1$ . $\square$
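As a hedged illustration of Remark 2, the sketch below evaluates the accuracy of a two-boundary classifier for Gaussian observations together with the four partial derivatives of the accuracy with respect to $(\mu_{0},\sigma_{0},\mu_{1},\sigma_{1})$, and takes their largest absolute value as the sensitivity. The function names, the boundaries, and the parameter values are assumptions made for the example; they do not come from the paper.

```python
import math

def pdf(x, mu, sigma):
    """Gaussian density with mean mu and standard deviation sigma."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def Q(z):
    """Standard normal c.d.f., following the notation of the text."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def accuracy(y, theta, p0=0.5, p1=0.5):
    """Accuracy of the two-boundary classifier; theta = (mu0, sigma0, mu1, sigma1)."""
    y1, y2 = y
    mu0, s0, mu1, s1 = theta
    return (p0 * (Q((y1 - mu0) / s0) - Q((y2 - mu0) / s0) + 1.0)
            + p1 * (-Q((y1 - mu1) / s1) + Q((y2 - mu1) / s1)))

def accuracy_gradient(y, theta, p0=0.5, p1=0.5):
    """Closed-form partial derivatives of the accuracy with respect to theta."""
    y1, y2 = y
    mu0, s0, mu1, s1 = theta
    f0 = lambda x: pdf(x, mu0, s0)
    f1 = lambda x: pdf(x, mu1, s1)
    return [
        p0 * (f0(y2) - f0(y1)),                                       # d/d mu0
        p0 * ((mu0 - y1) / s0 * f0(y1) - (mu0 - y2) / s0 * f0(y2)),   # d/d sigma0
        p1 * (f1(y1) - f1(y2)),                                       # d/d mu1
        p1 * ((mu1 - y2) / s1 * f1(y2) - (mu1 - y1) / s1 * f1(y1)),   # d/d sigma1
    ]

def sensitivity(y, theta, p0=0.5, p1=0.5):
    """Infinity norm of the gradient, as in Definition 2."""
    return max(abs(g) for g in accuracy_gradient(y, theta, p0, p1))

# Hypothetical boundaries and parameters, for illustration only.
y = (-0.2, 2.8)
theta = (0.0, 2.0, 1.0, 1.0)
print(accuracy(y, theta), sensitivity(y, theta))
```

A simple sanity check on such an implementation is to compare each closed-form derivative with a central finite difference of `accuracy`; the two should agree up to discretization error.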

A classification algorithm should be designed to have high accuracy and low sensitivity, so as to exhibit robust satisfactory performance in the face of adversarial manipulation. Unfortunately, in this paper we show that accuracy and sensitivity are directly related, so that optimizing the accuracy of a classifier inevitably also increases its sensitivity.

III A fundamental tradeoff between accuracy and sensitivity of classification algorithms

In this section, we characterize the tradeoff between accuracy and sensitivity of a classification algorithm for a given binary classification problem as described in (1). In particular, we prove that under some mild conditions, there exists a classifier that is less accurate than $\mathfrak{C}_{\text{ML}}(x;1)$, yet more robust to adversarial manipulation of the observations. This shows that there exists a tradeoff between accuracy and sensitivity at the maximum accuracy configuration.

Let $y^{*}=[y_{1}^{*}\;y_{2}^{*}\;\cdots\;y_{n}^{*}]^{\mathsf{T}}$ be the vector of the boundaries of $\mathfrak{C}_{\text{ML}}(x;1)$ , which maximizes $\mathcal{A}(y;\theta)$ . Let $\theta^{(i)}$ be the $i^{\text{th}}$ component of $\theta$ . We make the following assumptions:

A1:

The vector $\frac{\partial\mathcal{A}(y;\theta)}{\partial\theta}\Bigr{|}_{y^{*}}$ has a unique largest absolute element, located at index $j$ .

A2:

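There exists at least one boundary $y_{i}^{*}$ such that

$$\Bigg(p_{0}\frac{\partial}{\partial y_{i}}f_{0}(y_{i};\theta_{0})\Bigg|_{y_{i}^{*}}-p_{1}\frac{\partial}{\partial y_{i}}f_{1}(y_{i};\theta_{1})\Bigg|_{y_{i}^{*}}\Bigg)\frac{\partial y_{i}^{*}}{\partial\theta^{(j)}}\neq 0.$$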

Assumption A1 is specific to our definition of sensitivity in (11), and is not required if the $2$-norm is used (see Remark 4). Further, A2 is mild and typically satisfied in most problems.

Theorem III.1

(Accuracy vs. sensitivity tradeoff for classifier (2)) Let $y^{*}$ contain the boundaries of the classifier $\mathfrak{C}_{\text{ML}}(x;1)$. Then, under Assumptions A1 and A2, it holds

$$\left.\frac{\partial\mathcal{S}(y;\theta)}{\partial y}\right|_{y^{*}}\neq 0.$$

Proof:

Assumption A1 guarantees that sensitivity $\mathcal{S}(y;\theta)$ is differentiable with respect to $y$ at $y^{*}$ . Let $g\big{(}y;\theta\big{)}\triangleq\frac{\partial\mathcal{A}(y;\theta)}{\partial y}$ . Since $y^{*}$ maximizes $\mathcal{A}(y;\theta)$ , we have $g\big{(}y^{*};\theta\big{)}=0$ . Differentiating $g\big{(}y^{*};\theta\big{)}$ with respect to $\theta^{(j)}$ , and noting that $y^{*}$ depends on $\theta$ , we get:

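$$\frac{\mathsf{d}g(y^{*};\theta)}{\mathsf{d}\theta^{(j)}}=\frac{\partial g(y;\theta)}{\partial\theta^{(j)}}\Bigg|_{y^{*}}+\frac{\partial g(y;\theta)}{\partial y}\Bigg|_{y^{*}}\frac{\partial y^{*}}{\partial\theta^{(j)}}=0,$$

$$\Rightarrow\;\frac{\partial}{\partial y}\frac{\partial\mathcal{A}(y;\theta)}{\partial\theta^{(j)}}\Bigg|_{y^{*}}=-\frac{\partial^{2}\mathcal{A}(y;\theta)}{\partial y^{2}}\Bigg|_{y^{*}}\frac{\partial y^{*}}{\partial\theta^{(j)}},$$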

where the last equation follows by substituting $g\big(y;\theta\big)=\frac{\partial\mathcal{A}(y;\theta)}{\partial y}$ and switching the order of partial differentiation. Using (11) and Assumption A1, the left side of this last equation equals $\pm\frac{\partial\mathcal{S}(y;\theta)}{\partial y}\Big|_{y^{*}}$, because near $y^{*}$ the sensitivity coincides with the absolute value of the $j^{\text{th}}$ component of $\frac{\partial\mathcal{A}(y;\theta)}{\partial\theta}$. Further, differentiating (4) twice, we get $\frac{\partial^{2}}{\partial y^{2}}\mathcal{A}(y;\theta)=\operatorname{diag}(w_{1}(y_{1}),\cdots,w_{n}(y_{n}))$, where

$$w_{i}(y_{i})=p_{0}(-1)^{i+1}\frac{\partial}{\partial y_{i}}f_{0}(y_{i};\theta_{0})+p_{1}(-1)^{i}\frac{\partial}{\partial y_{i}}f_{1}(y_{i};\theta_{1}).$$

Assumption A2 guarantees that there exists a boundary $y_{i}^{*}$ such that $w_{i}(y_{i}^{*})\frac{\partial y_{i}^{*}}{\partial\theta^{(j)}}\neq 0$. The result follows from the implication derived above. ∎

Theorem III.1 implies that the sensitivity of the classifier $\mathfrak{C}(x;y)$ can be decreased by modifying the boundaries $y^{*}$. Yet, because $\mathfrak{C}(x;y^{*})$ exhibits the largest classification accuracy among all classifiers, the reduction of sensitivity inevitably decreases the accuracy of classification. In other words, for any classification problem (1) satisfying Assumptions A1 and A2 and for any classification algorithm (2), there exists an arbitrarily small $\delta$ such that$^{3}$

$$\mathcal{S}(y^{*}+\delta;\theta)<\mathcal{S}(y^{*};\theta)\quad\text{and}\quad\mathcal{A}(y^{*}+\delta;\theta)\leq\mathcal{A}(y^{*};\theta).$$

Footnote 3: The inequality for accuracy is strict for most distributions.

This result also implies that the robustness of a classification algorithm to adversarial manipulation of the data can be increased only at the expense of its accuracy of classification. Thus, a fundamental tradeoff exists between the accuracy of a classifier and its robustness to adversarial manipulation.
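The statement of Theorem III.1 can be checked numerically on a small example. The sketch below is only an illustration under assumed values: the priors, the Gaussian parameters, and the perturbation `delta` are hypothetical, and the sensitivity is approximated by central finite differences of the accuracy rather than by a closed-form expression. It computes the maximum-accuracy boundaries for $\eta=1$, perturbs them slightly, and compares accuracy and sensitivity before and after.

```python
import math

P0, P1 = 0.5, 0.5                      # hypothetical priors
THETA = (0.0, 2.0, 1.0, 1.0)           # hypothetical (mu0, sigma0, mu1, sigma1)

def cdf(x, mu, sigma):
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def accuracy(y, theta):
    """Accuracy of the classifier that returns H1 between y1 and y2, H0 elsewhere."""
    (y1, y2), (mu0, s0, mu1, s1) = y, theta
    return (P0 * (cdf(y1, mu0, s0) - cdf(y2, mu0, s0) + 1.0)
            + P1 * (cdf(y2, mu1, s1) - cdf(y1, mu1, s1)))

def sensitivity(y, theta, h=1e-6):
    """Infinity norm of dA/dtheta, approximated by central differences."""
    grads = []
    for k in range(len(theta)):
        up = list(theta)
        dn = list(theta)
        up[k] += h
        dn[k] -= h
        grads.append((accuracy(y, up) - accuracy(y, dn)) / (2 * h))
    return max(abs(g) for g in grads)

def max_accuracy_boundaries(theta):
    """Roots of the Gaussian boundary equation with eta = 1.

    Assumes unequal variances and a positive discriminant, which holds
    for the hypothetical parameters used here.
    """
    mu0, s0, mu1, s1 = theta
    a = 0.5 * (1 / s0**2 - 1 / s1**2)
    b = mu1 / s1**2 - mu0 / s0**2
    c = (math.log(s0 / s1) + math.log(P1 / P0)
         + mu0**2 / (2 * s0**2) - mu1**2 / (2 * s1**2))
    r = math.sqrt(b * b - 4 * a * c)
    return tuple(sorted(((-b - r) / (2 * a), (-b + r) / (2 * a))))

y_star = max_accuracy_boundaries(THETA)
delta = (-0.05, 0.05)                  # small widening of the H1 region
y_new = (y_star[0] + delta[0], y_star[1] + delta[1])

for name, y in (("y*", y_star), ("y* + delta", y_new)):
    print(name, round(accuracy(y, THETA), 4), round(sensitivity(y, THETA), 4))
```

For these assumed values the perturbed classifier is slightly less accurate and less sensitive than the maximum-accuracy one, which is the behavior the theorem describes. The direction of `delta` matters: it was chosen along a descent direction of the sensitivity, and other directions may increase it.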

Corollary III.2

(Accuracy and sensitivity of the linear classifier (7)) Let $y_{\text{L}}^{*}$ be the boundary given in (9) that maximizes the accuracy (in (8)) of the linear classifier $\mathfrak{C}_{\text{L}}(x;y)$. Then, under Assumptions A1 and A2, it holds

$$\left.\frac{\partial\mathcal{S}(y;\theta)}{\partial y}\right|_{y_{\text{L}}^{*}}\neq 0.$$

Proof:

Since $y_{\text{L}}^{*}$ corresponds to one of the boundaries contained in $y^{*}$ , the proof follows from Theorem III.1. ∎

Next, we show that this tradeoff also exists for the Maximum Likelihood classifier. This fact does not follow trivially from Theorem III.1, because the general classifier in Theorem III.1 has independent boundaries, while the boundaries of the Maximum Likelihood classifier depend on one another via (6). We make the following mild technical assumption.

A3:

The vectors $\left.\frac{\partial y(\eta,\theta)}{\partial\eta}\right|_{\eta=1}$ and $\left.\frac{\partial{\mathcal{S}(y;\theta)}}{\partial y}\right|_{y^{*}}$ are not orthogonal, where $y(\eta,\theta)$ contains the boundaries of $\mathfrak{C}_{\text{ML}}(x;\eta)$ .

Lemma III.3

(Accuracy and sensitivity of the ML classifier (5)) Let $y(\eta,\theta)$ contain the boundaries of the classifier $\mathfrak{C}_{\text{ML}}(x;\eta)$. Then, under Assumptions A1, A2 and A3, it holds

$$\left.\frac{\partial\mathcal{S}(y(\eta,\theta);\theta)}{\partial\eta}\right|_{\eta=1}\neq 0.$$

Proof:

Let $y^{*}$ contain the boundaries of the classifier $\mathfrak{C}_{\text{ML}}(x;\eta=1)$ . The derivative of $\mathcal{S}\big{(}y(\eta,\theta);\theta\big{)}$ with respect to $\eta$ can be written as:

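$$\left.\frac{\partial\mathcal{S}(y(\eta,\theta);\theta)}{\partial\eta}\right|_{\eta=1}=\left(\left.\frac{\partial\mathcal{S}(y;\theta)}{\partial y}\right|_{y^{*}}\right)^{\mathsf{T}}\left.\frac{\partial y(\eta,\theta)}{\partial\eta}\right|_{\eta=1}.$$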

We conclude following Theorem III.1 and Assumption A3. ∎

In what follows we numerically show that a tradeoff between accuracy and sensitivity also exists when the classification boundaries are not selected to maximize the accuracy of the classifier. To this aim, first we compute the accuracy and sensitivity of the ML classifier $\mathfrak{C}_{\text{ML}}(x;\eta)$, for different values of $\eta$. Notice that, by varying $0<\eta<\infty$, Equation (6) returns different classification boundaries and, thus, different classification algorithms. Similarly, we compute the accuracy and sensitivity of the linear classifier $\mathfrak{C}_{\text{L}}(x;y)$ by varying the single boundary $y$. Second, we numerically solve

$$\min_{y}\;\mathcal{S}(y;\theta)\quad\text{s.t.}\quad\mathcal{A}(y;\theta)=\zeta, \tag{15}$$

for different values of $\zeta$ ranging from $0.5$ to $\mathcal{A}(y^{*};\theta)$ . Notice that the minimization problem (15) returns the classifier with lowest sensitivity and accuracy equal to $\zeta$ , and that the boundaries solving the minimization problem (15) may not satisfy (6). Further, for a given number of classification boundaries, the minimization problem (15) returns a fundamental tradeoff curve relating accuracy and sensitivity over the range of $\zeta$ , which is independent of the choice of classification algorithm. Finally, the minimization problem (15) is not convex, because of its nonlinear equality constraint.
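The first of the two numerical procedures above, sweeping the threshold $\eta$ of the ML classifier and recording accuracy and sensitivity, can be sketched as follows. This is an illustrative sketch with hypothetical Gaussian parameters and a hypothetical grid of thresholds; it does not solve the constrained problem (15), which would require a nonlinear solver, and it does not reproduce the curves of Fig. 2.

```python
import math

P0, P1 = 0.5, 0.5                      # hypothetical priors
MU0, S0, MU1, S1 = 0.0, 2.0, 1.0, 1.0  # hypothetical Gaussian parameters

def pdf(x, mu, s):
    return math.exp(-((x - mu) ** 2) / (2 * s * s)) / (s * math.sqrt(2 * math.pi))

def cdf(x, mu, s):
    return 0.5 * (1.0 + math.erf((x - mu) / (s * math.sqrt(2.0))))

def boundaries(eta):
    """Two real roots of the Gaussian boundary equation.

    For the parameters above the roots are real only while eta stays
    below about 2.36, so the sweep is restricted accordingly.
    """
    a = 0.5 * (1 / S0**2 - 1 / S1**2)
    b = MU1 / S1**2 - MU0 / S0**2
    c = (math.log(S0 / S1) + math.log(P1 / P0)
         + MU0**2 / (2 * S0**2) - MU1**2 / (2 * S1**2) - math.log(eta))
    r = math.sqrt(b * b - 4 * a * c)
    return sorted(((-b - r) / (2 * a), (-b + r) / (2 * a)))

def accuracy(y1, y2):
    return (P0 * (cdf(y1, MU0, S0) - cdf(y2, MU0, S0) + 1.0)
            + P1 * (cdf(y2, MU1, S1) - cdf(y1, MU1, S1)))

def sensitivity(y1, y2):
    """Infinity norm of the closed-form gradient of the accuracy."""
    g = [
        P0 * (pdf(y2, MU0, S0) - pdf(y1, MU0, S0)),
        P0 * ((MU0 - y1) / S0 * pdf(y1, MU0, S0) - (MU0 - y2) / S0 * pdf(y2, MU0, S0)),
        P1 * (pdf(y1, MU1, S1) - pdf(y2, MU1, S1)),
        P1 * ((MU1 - y2) / S1 * pdf(y2, MU1, S1) - (MU1 - y1) / S1 * pdf(y1, MU1, S1)),
    ]
    return max(abs(v) for v in g)

etas = [0.2 + 0.1 * k for k in range(19)]          # 0.2, 0.3, ..., 2.0
curve = [(eta, accuracy(*boundaries(eta)), sensitivity(*boundaries(eta))) for eta in etas]

for eta, acc, sens in curve:
    print(f"eta={eta:.1f}  accuracy={acc:.4f}  sensitivity={sens:.4f}")
```

Plotting the `(accuracy, sensitivity)` pairs in `curve` gives the ML tradeoff curve for these assumed parameters; on this grid the accuracy peaks at $\eta=1$, in line with the optimality of $\mathfrak{C}_{\text{ML}}(x;1)$.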

Fig. 2(a) shows the accuracy-sensitivity tradeoff for the Gaussian hypothesis testing problem discussed in Remark 2. In this case, since the ML classifier has $2$ boundaries, we also consider general classifiers with $2$ boundaries. We observe that the general classifier exhibits the tradeoff at the maximum accuracy point (identified by the red dot) in accordance with Theorem III.1. Several comments are in order. First, the ML and linear classifiers also exhibit a tradeoff at their respective maximum accuracy points, in accordance with Lemma III.3 and Corollary III.2. Second, the tradeoff for the ML classifier is not strict and there exist points where reducing accuracy increases sensitivity (green dot in the figure). On the other hand, the tradeoff for the general classifier is strict. This is because the decision boundaries of the general classifier can be varied independently, whereas the boundaries of the ML classifier are related to each other since they are the solutions of (6). Thus, the general classifier provides more flexibility in choosing the boundaries. Similarly, the tradeoff for the linear classifier is not strict. Third, the tradeoff curve for the general classifier is below the tradeoff curves for the ML and linear classifiers, again due to the aforementioned reason.$^{4}$ Fourth, the maximum accuracy of the linear classifier (corresponding to the red square) is smaller than that of the ML classifier (corresponding to the red dot), but its sensitivity at the maximum accuracy configuration is also smaller than that of the ML classifier. This explains the observed phenomenon that, in some cases, linear models are more robust to adversarial attacks than nonlinear models (for example, neural networks) [17]. Finally, the curves are not smooth because of the $\infty$-norm in Definition 2.

Footnote 4: ML and linear classifiers are particular instances of the general classifier.

We conclude with two remarks on using the $2$ -norm to define sensitivity and on the necessity of Assumption A1.

Remark 3

(Classification sensitivity using the $2$-norm) In Definition 2, the $\infty$-norm captures the largest change in accuracy with respect to a change in a single component of the parameter vector $\theta$. Instead, using the $2$-norm to define the sensitivity of a classification algorithm leads to

$$\mathcal{S}(y;\theta)=\left\|\frac{\partial\mathcal{A}(y;\theta)}{\partial\theta}\right\|_{2}, \tag{16}$$

which captures the change in accuracy with respect to changes in all the components of $\theta$ . Fig. 2(b) shows the sensitivity versus accuracy tradeoff when sensitivity is defined using (16) instead of (11). Here, a strict tradeoff exists for the general, ML and linear classifiers. Further, the tradeoff curves are smooth since the $2$ -norm is a smooth function. $\square$

Remark 4

(Necessity of Assumption A1) Theorem III.1 may not hold when Assumption A1 is not satisfied, and we illustrate this fact in Fig. 2(c). In this case, the vector $\frac{\partial\mathcal{A}(y^{*};\theta)}{\partial\theta}=[0.043\;0.024\;-0.043\;0.040]^{\mathsf{T}}$ has two elements with maximum absolute value, violating Assumption A1. As a result, a tradeoff at the maximum accuracy point (denoted by the red dot) does not exist, as shown in the figure. Yet, a tradeoff still exists for sensitivity defined as in (16), indicating that A1 might be required only for definition (11). $\square$

IV An illustrative example

In this section we illustrate numerically the implications of Theorem III.1. In particular, we consider two classification algorithms with different accuracy and sensitivity, and show how their performance degrades differently when the observations are corrupted by an adversary. This implies that, when robustness to adversarial manipulation of the observations is a concern, classification algorithms should be designed to simultaneously optimize accuracy and sensitivity, and should not operate at their point of maximum accuracy.

Consider the classification problem (1), and let

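$$f_{0}(x;\theta_{0})=\mathcal{N}(x;\mu_{0},\sigma_{0}),\qquad f_{1}(x;\theta_{1})=\mathcal{N}(x;\mu_{1},\sigma_{1}). \tag{17}$$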

Let $\mathfrak{C}^{1}=\mathfrak{C}_{\text{ML}}(x;1)$ and $\mathfrak{C}^{2}=\mathfrak{C}_{\text{ML}}(x;0.4603)$ be the classification algorithms identified by the red and green points in Fig. 2(a), respectively. Notice that, when the observations are not manipulated and follow the distributions (17), $\mathfrak{C}^{1}$ achieves higher accuracy and sensitivity than $\mathfrak{C}^{2}$ . This is also the case when using definition (16), as illustrated in Fig. 2(b). While the nominal distributions (17) are used to design the classifiers $\mathfrak{C}^{1}$ and $\mathfrak{C}^{2}$ , we consider an adversary that manipulates the observations so that their true distributions are

[TABLE]

where $\bar{\mu}_{0}$, $\bar{\mu}_{1}$, $\bar{\sigma}_{0}$, and $\bar{\sigma}_{1}$ are unknown parameters selected by the adversary to deteriorate the accuracy of the classifiers.

To evaluate the accuracy of $\mathfrak{C}^{1}$ and $\mathfrak{C}^{2}$ in classifying the manipulated observations, we generate $10000$ observations obeying the modified distributions (18), and compute the accuracy of each classifier as the ratio of the number of correct predictions to the total number of observations. We repeat this experiment $100$ times, and then compute the average accuracy of the classifiers over all trials.
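The evaluation procedure just described can be sketched as follows. This is a minimal illustration and not the code used for Table I: the Gaussian parameters, the single-boundary rule `threshold_classifier`, and all function names are hypothetical, whereas the classifiers in Table I have two boundaries and are evaluated on the distributions (17) and (18).

```python
import random

def sample_observations(rng, n_obs, mu0, sigma0, mu1, sigma1, p1=0.5):
    """Draw labelled observations: pick a hypothesis, then a Gaussian sample."""
    data = []
    for _ in range(n_obs):
        label = 1 if rng.random() < p1 else 0
        mu, sigma = (mu1, sigma1) if label else (mu0, sigma0)
        data.append((rng.gauss(mu, sigma), label))
    return data

def threshold_classifier(y):
    """Hypothetical single-boundary rule: decide H1 when x exceeds y."""
    return lambda x: 1 if x > y else 0

def average_accuracy(classify, params, n_obs=10_000, n_trials=100, seed=0):
    """Average, over n_trials, of the fraction of correct predictions."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        data = sample_observations(rng, n_obs, *params)
        correct = sum(1 for x, label in data if classify(x) == label)
        total += correct / n_obs
    return total / n_trials

# Illustrative parameters only: (mu0, sigma0, mu1, sigma1).
nominal = (0.0, 1.0, 2.0, 1.0)
manipulated = (0.5, 1.5, 1.5, 1.5)
classifier = threshold_classifier(1.0)  # boundary designed for the nominal case

# Fewer trials than the 100 used in the text, to keep the demonstration quick.
print(average_accuracy(classifier, nominal, n_trials=10))
print(average_accuracy(classifier, manipulated, n_trials=10))
```

The classifier is designed once on the nominal parameters and then scored on data drawn from the manipulated ones, which mirrors the separation between design and evaluation distributions in the experiment.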

Table I summarizes the results of the classification problems with $\mathfrak{C}^{1}$ and $\mathfrak{C}^{2}$ on the altered observations. In particular, $y_{1}$ and $y_{2}$ are the decision boundaries of the classifiers, while $\mathcal{S}(y;\theta)$ and $\mathcal{A}(y;\theta)$ denote their nominal sensitivity and accuracy. The quantities $\mathcal{A}_{s1}(x)$ and $\mathcal{A}_{s2}(x)$ denote the average accuracy of the classifiers when, respectively, the adversarial parameters are $\bar{\mu}_{1}=\bar{\mu}_{0}=\bar{\sigma}_{0}=0$, $\bar{\sigma}_{1}=3$, and $\bar{\mu}_{0}=1$, $\bar{\sigma}_{0}=2$, $\bar{\mu}_{1}=-2$, $\bar{\sigma}_{1}=1.5$. The results show that, although $\mathfrak{C}^{1}$ exhibits higher accuracy than $\mathfrak{C}^{2}$ when the observations follow the nominal distributions (17), $\mathfrak{C}^{2}$ outperforms $\mathfrak{C}^{1}$ in both adversarial scenarios, as supported by our analysis.

V Dependency of accuracy and sensitivity on the parameters of the distributions

In this section we analyze the effect of the parameters $\theta=[\theta_{0}\;\theta_{1}]^{\mathsf{T}}$ on the accuracy and sensitivity of the classifiers. We consider the Maximum Likelihood classifier $\mathfrak{C}_{\text{ML}}(x;\eta=1)$ for the analysis since it maximizes the accuracy. (A similar analysis can also be performed for general and linear classifiers; we omit it due to space constraints.) Specifically, we wish to determine the distribution parameters that minimize the sensitivity while providing a given level of accuracy. We consider the following problem:

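$$\min_{\theta\in\Theta}\;\mathcal{S}(y^{*}(\theta);\theta)\quad\text{s.t.}\quad\mathcal{A}(y^{*}(\theta);\theta)=\gamma,\tag{19}$$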

where $y^{*}(\theta)$ denotes the boundaries of the ML classifier $\mathfrak{C}_{\text{ML}}(x;1)$, which depend on $\theta$ via (6), $0.5\leq\gamma\leq 1$ denotes the accuracy level, and $\Theta$ denotes the set of admissible parameters $\theta$ of the distributions. The optimization problem (19) captures the fundamental limit of sensitivity that can be achieved by an ML classifier with a desired level of accuracy. Note that, similarly to (15), the optimization problem in (19) is not convex due to the nonlinear equality constraint.

Let $\theta^{*}(\gamma)$ and $\mathcal{S}^{*}(\gamma)$ denote the optimal parameters and minimum sensitivity of the optimization problem in (19). Fig. 3(a) shows the variation of $\mathcal{S}^{*}(\gamma)$ as a function of accuracy level $\gamma$ for the Gaussian hypothesis testing problem detailed in Remark 2. It can be observed that $\mathcal{S}^{*}(\gamma)$ is a decreasing function of $\gamma$. This is due to the fact that, to achieve a higher level of accuracy, the “separation” between the two distributions should be larger, as evident in Fig. 3(b). At a larger separation, the effect of changes in the distribution parameters on the accuracy of the classifier is smaller, thereby resulting in a smaller sensitivity.

Lemma V.1

(Accuracy and sensitivity for Gaussian testing) Consider a hypothesis testing problem with $f_{0}=\mathcal{N}(x;\mu_{0},\sigma)$ and $f_{1}=\mathcal{N}(x;\mu_{1},\sigma)$, with $\theta=[\mu_{0}\;\mu_{1}]^{\mathsf{T}}$ and $p_{0}=p_{1}=0.5$. Assume that $\sigma$ is fixed. Then, for classifier $\mathfrak{C}_{\text{ML}}(x;1)$, $\mathcal{S}^{*}(\gamma)$ is a decreasing function of accuracy $\gamma$.

Proof:

For the Gaussian testing problem with $\sigma_{0}=\sigma_{1}=\sigma$, $p_{0}=p_{1}=0.5$, Equation (6) has a single solution for $\eta=1$ given by $y^{*}(\theta)=\frac{\mu_{0}+\mu_{1}}{2}$. Using (8), the accuracy is given by $\mathcal{A}(y^{*}(\theta);\theta)=Q\left(\frac{|\mu_{1}-\mu_{0}|}{2\sigma}\right)$. Since $\sigma$ is fixed, we take the derivative of $\mathcal{A}(y^{*}(\theta);\theta)$ with respect to the means:

[TABLE]

To conclude, $\mathcal{A}(y^{*}(\theta);\theta)$ and $\mathcal{S}(y^{*}(\theta);\theta)$ are increasing and decreasing functions of $|\mu_{1}-\mu_{0}|$, respectively. ∎
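The monotonic behavior claimed in Lemma V.1 can be checked numerically. The sketch below makes two assumptions that are not taken from the text: the accuracy of the midpoint rule is computed directly as $\Phi\left(\frac{|\mu_{1}-\mu_{0}|}{2\sigma}\right)$, with $\Phi$ the standard normal cumulative distribution function (how this maps to the function $Q$ above depends on the convention adopted for $Q$), and the sensitivity is taken as the largest absolute partial derivative of the accuracy with respect to the means, with the boundary held at $y^{*}(\theta)$. Both partial derivatives have the same magnitude here, so a $2$-norm would differ only by a factor $\sqrt{2}$. All function names are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # zero mean, unit variance

def ml_accuracy(mu0, mu1, sigma):
    """Accuracy of the midpoint rule for equal priors and equal variances."""
    return STD_NORMAL.cdf(abs(mu1 - mu0) / (2.0 * sigma))

def ml_sensitivity(mu0, mu1, sigma):
    """Largest absolute partial derivative of the accuracy w.r.t. mu0, mu1."""
    return STD_NORMAL.pdf(abs(mu1 - mu0) / (2.0 * sigma)) / (2.0 * sigma)

def min_sensitivity_at_accuracy(gamma, sigma):
    """Sensitivity at accuracy gamma in [0.5, 1) when only the means are free.

    With sigma fixed, the accuracy constraint pins |mu1 - mu0|, so the
    minimization reduces to a single evaluation.
    """
    z = STD_NORMAL.inv_cdf(gamma)  # z = |mu1 - mu0| / (2 * sigma)
    return STD_NORMAL.pdf(z) / (2.0 * sigma)

# Accuracy grows and sensitivity shrinks as the means move apart.
for d in (0.5, 1.0, 2.0, 4.0):
    print(d, ml_accuracy(0.0, d, 1.0), ml_sensitivity(0.0, d, 1.0))

# Sensitivity as a function of the accuracy level.
for gamma in (0.6, 0.7, 0.8, 0.9):
    print(gamma, min_sensitivity_at_accuracy(gamma, 1.0))
```

Because the constraint determines the separation uniquely in this setting, the curve of sensitivity against $\gamma$ is obtained without a numerical solver.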

Lemma V.2

(Accuracy and sensitivity for Exponential testing) Consider a hypothesis testing problem with $f_{0}(x;\lambda_{0})=\lambda_{0}e^{-\lambda_{0}x}$ and $f_{1}(x;\lambda_{1})=\lambda_{1}e^{-\lambda_{1}x}$, with $x\geq 0$, $\theta=\lambda_{1}$, and $p_{0}=p_{1}=0.5$. Then, for classifier $\mathfrak{C}_{\text{ML}}(x;1)$ and a fixed $\lambda_{0}$, $\mathcal{S}^{*}(\gamma)$ is a decreasing function of accuracy $\gamma$.

Proof:

Without loss of generality, we assume $0<\lambda_{0}<\lambda_{1}$. For $p_{0}=p_{1}=0.5$, Equation (6) has a single solution for $\eta=1$ given by $y^{*}(\theta)=\frac{1}{\lambda_{1}-\lambda_{0}}\log\left(\frac{\lambda_{1}}{\lambda_{0}}\right)$. Using (8),

[TABLE]

where $r=\frac{\lambda_{1}}{\lambda_{0}}$. The sensitivity is given by

[TABLE]

To conclude, by inspecting the derivatives of $\mathcal{A}(y^{*}(\theta);\theta)$ and $\mathcal{S}(y^{*}(\theta);\theta)$ with respect to $r$, it can be seen that they are increasing and decreasing functions of $r$, respectively. ∎
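The conclusion of Lemma V.2 can also be checked numerically. The sketch below works directly from the densities and the boundary $y^{*}(\theta)$ stated in the lemma and its proof, rather than from the closed-form expressions of the proof. It assumes that, for $\lambda_{0}<\lambda_{1}$, the classifier decides $f_{1}$ when $x<y^{*}$, and that the sensitivity is the absolute partial derivative of the accuracy with respect to $\lambda_{1}$ with the boundary held at $y^{*}$. All function names are hypothetical.

```python
import math

def ml_boundary(lam0, lam1):
    """Point where the two exponential densities intersect (lam0 < lam1)."""
    return math.log(lam1 / lam0) / (lam1 - lam0)

def accuracy(y, lam0, lam1):
    """Accuracy of the rule 'decide f1 when x < y' with equal priors."""
    correct_h0 = math.exp(-lam0 * y)        # P(x > y) under f0
    correct_h1 = 1.0 - math.exp(-lam1 * y)  # P(x < y) under f1
    return 0.5 * correct_h0 + 0.5 * correct_h1

def sensitivity(lam0, lam1):
    """|dA/dlam1| at the ML boundary, with the boundary held fixed."""
    y = ml_boundary(lam0, lam1)
    return 0.5 * y * math.exp(-lam1 * y)

# Sweep the ratio r = lam1 / lam0 with lam0 fixed at 1.
lam0 = 1.0
for r in (1.5, 2.0, 4.0, 8.0):
    lam1 = r * lam0
    y_star = ml_boundary(lam0, lam1)
    print(r, accuracy(y_star, lam0, lam1), sensitivity(lam0, lam1))
```

In this sweep the accuracy increases with $r$ while the sensitivity decreases, in agreement with the statement of the lemma.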

VI Conclusion and future work

In this paper we show that a fundamental tradeoff exists between the accuracy of a binary classification algorithm and its sensitivity to adversarial manipulation of the data. Thus, accuracy can only be maximized at the expense of sensitivity to data manipulation, and this tradeoff cannot be arbitrarily improved by tuning the algorithm’s parameters. Directions of future interest include the extension to $M$-ary testing problems, as well as the formal characterization of the relationships between the complexity of the classification algorithm and its accuracy versus sensitivity tradeoff.

Bibliography

[1] Y. Le Cun, K. Kavukcuoglu, and C. Farabet. Convolutional networks and applications in vision. In International Symposium on Circuits and Systems, pages 253–256, Paris, France, May 2010.
[2] A. Krizhevsky, I. Sutskever, and G. E. Hinton. Imagenet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems, pages 1097–1105, Lake Tahoe, NV, USA, Dec 2012.
[3] G. E. Dahl, D. Yu, L. Deng, and A. Acero. Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition. IEEE Transactions on Audio, Speech, and Language Processing, 20(1):30–42, 2012.
[4] G. Hinton, L. Deng, D. Yu, G. E. Dahl, A. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, T. N. Sainath, and B. Kingsbury. Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups. IEEE Signal Processing Magazine, 29(6):82–97, 2012.
[5] G. E. Dahl, J. W. Stokes, L. Deng, and D. Yu. Large-scale malware classification using random projections and neural networks. In International Conference on Acoustics, Speech and Signal Processing, pages 3422–3426, Vancouver, BC, Canada, May 2013.
[6] C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, and R. Fergus. Intriguing properties of neural networks. In International Conference on Learning Representations, Banff, Canada, Apr 2014.
[7] I. J. Goodfellow, J. Shlens, and C. Szegedy. Explaining and harnessing adversarial examples. In International Conference on Learning Representations, San Diego, CA, USA, May 2015.
[8] D. Lowd and C. Meek. Adversarial learning. In International Conference on Knowledge Discovery in Data Mining, pages 641–647, Chicago, IL, USA, Aug 2005.