Reliable Smart Road Signs
Muhammed O. Sayin, Chung-Wei Lin, Eunsuk Kang, Shinichi Shiraishi, Tamer Başar

TL;DR
This paper introduces a game-theoretic detection mechanism for smart road signs using error-correction codes to enhance robustness against adversarial attacks, ensuring reliable classification in intelligent transportation systems.
Contribution
It proposes a novel game-theoretic approach combined with error-correction methods for robust detection of smart road signs against large-scale adversarial perturbations.
Findings
Effective detection strategy under worst-case attacks
Robustness demonstrated across various scenarios
Enhanced security for smart transportation systems
Muhammed O. Sayin, Chung-Wei Lin, Eunsuk Kang, Shinichi Shiraishi, and Tamer Başar
This research was partially supported by the U.S. Office of Naval Research (ONR) MURI grant N00014-16-2710. M. O. Sayin and T. Başar are with the Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA (e-mail: {sayin2,basar1}@illinois.edu). C.-W. Lin is with National Taiwan University, Taipei, Taiwan (e-mail: [email protected]). E. Kang is with Carnegie Mellon University, Pittsburgh, PA, 15213, USA (e-mail: [email protected]). S. Shiraishi is with Toyota InfoTechnology Center Co., Ltd., Minato-ku, Tokyo, 107-0052, Japan (e-mail: [email protected]).
Abstract
In this paper, we propose a game theoretical adversarial intervention detection mechanism for reliable smart road signs. A future trend in intelligent transportation systems is “smart road signs” that incorporate smart codes (e.g., visible at infrared) on their surface to provide more detailed information to smart vehicles. Such smart codes make road sign classification problem aligned with communication settings more than conventional classification. This enables us to integrate well-established results in communication theory, e.g., error-correction methods, into road sign classification problem. Recently, vision-based road sign classification algorithms have been shown to be vulnerable against (even) small scale adversarial interventions that are imperceptible for humans. On the other hand, smart codes constructed via error-correction methods can lead to robustness against small scale intelligent or random perturbations on them. In the recognition of smart road signs, however, humans are out of the loop since they cannot see or interpret them. Therefore, there is no equivalent concept of imperceptible perturbations in order to achieve a comparable performance with humans. Robustness against small scale perturbations would not be sufficient since the attacker can attack more aggressively without such a constraint. Under a game theoretical solution concept, we seek to ensure certain measure of guarantees against even the worst case (intelligent) attackers that can perturb the signal even at large scale. We provide a randomized detection strategy based on the distance between the decoder output and the received input, i.e., error rate. Finally, we examine the performance of the proposed scheme over various scenarios.
Index Terms:
Game theory; Autonomous driving; Traffic sign recognition; Adversarial classification; Certifiable machine learning.
I Introduction
Machine learning is one of the key enabling technologies for autonomous vehicles. An autonomous vehicle can learn how to recognize the surroundings and can base its strategic decisions on the information learnt. It is only a matter of time for autonomous driving to replace human drivers completely. However, for the time being, there are still important, yet not completely addressed, challenges for autonomous driving. Road-sign classification is one of these challenges. Varying weather conditions, changing lighting throughout the day and occlusion are known to pose challenges to road-sign recognition/classification in real-time [1]. However, recently, it has been shown that there can also be physical adversarial modifications, e.g., stickers or graffiti, on the road signs to mislead the classification algorithms [2].
I-A Prior Literature
In the field of intelligent transportation systems, there has been extensive effort to mitigate the former challenge [3, 4, 5, 6, 7, 8, 9]. In [3], the authors have studied convolutional neural networks trained according to hinge loss stochastic gradient descent to achieve fast and stable convergence rates with substantial recognition performance. In [4] and [5], the authors have proposed text-based detection systems for traffic panels that could include information that can vary substantially. Computational complexity of the recognition algorithms plays a significant role for real-time applications since autonomous vehicles are time-critical systems [6, 7]. In [6], the authors have sought to enhance the performance of convolutional neural networks for faster performance in real-time applications through localization of the traffic-signs in the input images based on their types. In [7], the authors have proposed kernel-based extreme learning machines with deep perceptual features to achieve comparable performance to hinge-loss stochastic gradient based convolutional neural networks (proposed in [3]) with reduced computational complexity. Tree-based hierarchical structures have also been proposed to achieve coarse-to-fine road sign detection [8, 9].
Different from the previous works [3, 4, 8, 5, 6, 9, 7], however, in this paper, we seek to address the latter challenge, i.e., road-sign classification in adversarial environments, where there can be an intelligent attacker modifying road signs physically as exemplified in [2]. Especially for vision-based classification algorithms, it is an important issue that an attacker could craft the input through perturbations that are imperceptible for humans (i.e., a human would still easily classify the input correctly) while the algorithm classifies the input as the attacker’s targeted class [10, 11, 12, 13, 2]. Such an input sample is called an adversarial example [10]. Recently, a substantial number of defense methods have been proposed to make machine learning algorithms robust against adversarial scenarios. These defense methods have been developed to provide robustness against certain classes of attacks, and it has been shown that it is possible to bypass them all via small modification of the attacks [12].
I-B Smart Road Signs
Our goal, here, is to achieve reliable identification of road signs by smart vehicles without limiting the solution to learning-based techniques. Particularly, we can view the road-sign classification problem from a wider perspective as a communication problem. The road sign and smart vehicle can be viewed as a transmitter and a receiver, respectively. Then, the message is the type of the road sign, the signal carrying that message is the physical road sign, and the signal received is its digital image taken by the smart vehicle. The relation in between the signals sent and received, i.e., physical road sign and its digital image, can be modeled via a noisy channel that can lead to errors in the transmitted message. Based on this viewpoint, we can reconfigure this information flow by also designing the signal, i.e., the physical road sign, via the well-established tools developed in communication theory.
How we can re-design physical road signs and which medium we can choose to transmit information are only limited by our imagination and the corresponding financial burden to adapt the infrastructure. For example, at each road sign, we could have included road side units that can transmit the message via dedicated short range communication (DSRC) radios [14, 15] although there might also be adversarial interventions on those radio signals [15, 16]. However, the future trend in road signs is to include (e.g., infrared) smart codes that can be read by smart vehicles as illustrated figuratively in Fig. 1(b) [17, 18]. Such smart codes provide flexibility in transmitting more information to smart vehicles instead of just the type of road signs.
Remark.
Due to the regularity of smart codes, e.g., see Fig. 1, we can identify them via image processing techniques instead of learning-based techniques. For example, in [19], the authors use a similarity measure to identify the countdown numbers in a traffic light, instead of traditional number recognition algorithms. Furthermore, quick response (QR) codes constitute another example where we can incorporate image processing techniques to identify the underlying code, e.g., [20].
We can attain reliable information transmission with certain formal guarantees when we construct the smart codes via error-correction methods. Particularly, in coding theory, error correction methods introduce redundancy to signals, i.e., codewords, in order to recover the underlying message as accurately as possible when there is a noisy channel that can perturb the signal [21]. Correspondingly, if the amount of perturbation on the smart code, due to some physical modifications as seen in Fig. 1(d), were less than half of the minimum distance between any two codewords, then we could have recovered the underlying codeword without any error. However, this is not the end of the story as explained below.
I-C Motivation
Recall that physical adversarial examples in visual tasks are defined as inputs crafted by an intelligent attacker in order to mislead the classification algorithms while a human can still classify them accurately without any difficulty [2, 13]. This challenge is important to mitigate in order to attain a comparable performance with human drivers. However, such a threat model is not appropriate for smart codes since they will not be visible to humans and, even if they were visible, they would not be easily interpretable by humans. Since humans are out of the loop, there will not be such a constraint limiting the perturbation amount on the attack that we are seeking to defend against. In the literature of communication in adversarial environments [22, 23], it is evident that it will not be sufficient for smart codes to be robust against just small scale perturbations if the adversary is powerful enough to launch large scale perturbations. Therefore, we introduce a new threat model where the intelligent adversary can also perturb the input at large scale in order to lead to erroneous decoding.
Given a codeword received, our goal is to detect whether there exists an adversarial intervention or not, e.g., as illustrated in Fig. 1(e). Note that the perceived codeword might differ from the original codeword not only due to adversarial intervention but also due to random perturbations that inevitably appear in the process of interpreting the infrared image of the smart code. Indeed, the presence of such random perturbations is the main reason to incorporate error correction methods while constructing the smart codes. Note also that an intelligent attacker attacks by taking the detection mechanisms into account. Correspondingly, while designing the detection strategy, we need to anticipate the reaction of the attacker. However, one-level depth reasoning where the detector designs the strategy by anticipating the reaction of the attacker would not be effective if the attacker has taken this proactive defense into account and has reacted in a way that can undermine it. In [12], the authors have shown this phenomenon by bypassing the state-of-the-art defense mechanisms through strategic modification of the attack (those mechanisms defend against). To mitigate this issue, we propose to design the detection strategies under the solution concept of game theoretical equilibrium [24]. However, this paper is definitely not the first one approaching the adversarial classification (or intervention detection) problem through a game-theoretical lens. In the following, we review these studies.
I-D Game Theoretical Approaches
In [25], the authors have introduced a non-zero sum game between an attacker and a classifier; however, they have not studied any notion of equilibrium. In [26, 27], the authors have studied adversarial prediction problems for a certain class of learners, e.g., support-vector-machines, in terms of Nash and Stackelberg equilibria, respectively. Recently, in [28], the authors have analyzed adversarial binary-classification as a non-zero sum game between an attacker and a classifier. The classifier seeks to detect whether the input is coming from the attacker or from a known benign-distribution. On the other side, the attacker seeks to maximize his reward (which depends on the input) without being detected by the classifier. The authors have shown that the classifier can restrict the strategies to mixtures of classifiers setting threshold on the reward of the attacker without any loss of generality. However, the results in [28] cannot be extended to our problem setting since the codeword attacked can also be perturbed randomly in the process of reading it. Due to that randomness, different attacks with different rewards can lead to the same codeword received. Therefore, given the received codeword, the defender cannot know the attacker’s reward to compare it against such a threshold.
I-E Our Contributions
In this paper, we propose an adversarial intervention detection mechanism for smart road signs in order to ensure reliable recognition by smart vehicles. We model the interaction between the detection mechanism and attackers as a zero-sum Stackelberg game [24] where the detector is the leader. Particularly, attackers can attack road signs by physically modifying their smart codes at large or small scales, as illustrated in Fig. 1(d), while knowing the detection mechanism. The detector seeks to minimize a performance metric that includes cost of losing the opportunity of preventing future attacks by not being able to detect it now, cost of adversary-induced decoding errors or failures, false alarm cost, and easiness of deceptive perturbations. Against the worst-case attacker who seeks to maximize the detector’s performance metric, the detector designs a randomized detection rule based on the distance between the received codeword and the decoder output, i.e., error rate.
The game theoretical solution concept yields that the detector needs to anticipate the attacker’s best reaction to the proposed detection policy. However, the large size of the attacker’s strategy space can lead to computational issues while computing the best detection rule offline. To this end, we examine the attacker’s actions and show that the attacker can be viewed as selecting an action from a quotient space of the actual attack space with respect to a certain equivalence relation, which will be described in detail in Subsection IV-A. However, that quotient space can still be large to search over if there are many distinct road signs. Accordingly, we provide a method to relax the attack space to address such computational issues in Subsection IV-C. This conservative relaxation where the attacker is viewed to possess more power than in practice enables us to transform the problem into an efficient linear program (LP) with substantially smaller size. In addition to game theoretical results, we also analyze the performance of the proposed detection mechanism numerically over various scenarios.
Our main contributions are as follows:
- To the best of our knowledge, this is the first work in the literature to address adversarial intervention on smart road signs within a game theoretical framework.
- The randomized detection rule developed under the solution concept of game theoretical equilibrium ensures robustness against the worst-case attacker that attacks to maximize the cost for the detector while knowing the detection mechanism.
- By relaxing the attacker’s strategy space, we provide an efficient (offline) LP-based algorithm to compute the best randomized detection strategy, which can reduce the verification complexity.
The paper is organized as follows: In Section II, we provide preliminary information about error-correction coding. In Section III, we formulate the problem. In Section IV, we analyze the equilibrium of the game. We provide numerical examples in Section V. We conclude the paper and identify possible research directions in Section VI.
Nomenclature
Problem Setting:
linear block code
codeword length
message length
(minimum) distance of the code
error-correction diameter
alphabet of the symbols
alphabet size
set of all codewords
set of encoded codewords
set of decodable codewords
encoded codeword
received (noisy) codeword
decoder output for decodable
noisy channel
probability of
probability of error in a symbol
Hamming distance
Game Setting:
road-sign classification game
Attacker
Detector
’s cost function
’s randomized detection rule
attacked codeword
crafted codeword
’s (pure) action
’s action space
’s mixed strategy
loss due to decoding error/failure
multiplicative factors
II Preliminaries in Error Correction Coding
Error correction codes provide certain formal guarantees for the transmission of digital data over noisy channels as long as the deviation on the message sent due to random/intelligent noise is less than a certain threshold with respect to a certain distance metric [21]. Since a smart code can be viewed as a finite-size block and perturbations can be viewed as Boolean operations, e.g., flipped bits, we specifically consider linear block codes, which encode data in blocks. They are called linear because any linear combination of codewords is also a codeword. Formally, a linear block code, denoted by $(n,k,d)_q$, operates over a finite alphabet of symbols whose size is denoted by $q$, and maps $k$ symbols to $n$ symbols. The (minimum) distance of a block, denoted by $d$, is the minimum number of positions where any two distinct codewords differ, i.e., the Hamming distance [21] in-between the distinct codewords. The abstraction of the code via $(n,k,d)_q$ enables us to study all the linear block codes in a unified way.
We emphasize that the Singleton bound [29, 30] that all linear block codes satisfy is given by
$$d \leq n - k + 1, \tag{1}$$
where the equality holds for Reed-Solomon codes [31]. Furthermore, the minimum distance $d$ implies that the block code can detect $d-1$ symbol errors and correct up to
$$t := \left\lfloor \frac{d-1}{2} \right\rfloor \tag{2}$$
symbol errors since there exists no other codeword within diameter $t$ of each codeword.
Consider that the number of symbol errors, denoted by $e$, is more than half of the minimum distance, i.e., $e > t$. We say that a decoding error exists if the Hamming distance between the received codeword and any other codeword is less than or equal to $t$, i.e., if we decode it erroneously. Further, we say that a decoding failure exists if the Hamming distance between the received codeword and all the other codewords is more than $t$, i.e., if the received codeword is not decodable [32].
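The definitions above can be illustrated with a small sketch. The names used here (n for the codeword length, k for the message length, d for the minimum distance, t for the correction radius) and the toy ternary repetition code are assumptions chosen for illustration, not the codes used in the paper. The helper `decode_outcome` is hypothetical and simply applies bounded-distance decoding with radius t.

```python
def hamming(a, b):
    """Number of positions where two equal-length words differ."""
    if len(a) != len(b):
        raise ValueError("words must have equal length")
    return sum(x != y for x, y in zip(a, b))


def min_distance(codebook):
    """Minimum Hamming distance over all pairs of distinct codewords."""
    return min(
        hamming(a, b)
        for i, a in enumerate(codebook)
        for b in codebook[i + 1:]
    )


def decode_outcome(sent, received, codebook, t):
    """Classify bounded-distance decoding as correct, error, or failure."""
    # With t < d / 2, at most one codeword lies within distance t.
    near = [c for c in codebook if hamming(c, received) <= t]
    if not near:
        return "failure"  # received word is not decodable
    return "correct" if near[0] == sent else "error"


# Toy example (assumption): ternary repetition code, n = 4, k = 1.
codebook = ["0000", "1111", "2222"]
n, k = 4, 1
d = min_distance(codebook)
t = (d - 1) // 2

assert d <= n - k + 1  # Singleton bound

outcomes = {
    received: decode_outcome("0000", received, codebook, t)
    for received in ("0001", "0111", "0011")
}
```

In this toy code, one symbol error is corrected, three symbol errors land in the decodable region of another codeword (a decoding error), and two symbol errors leave the word outside every decodable region (a decoding failure).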
III Problem Formulation
Consider that road signs are encoded into smart codes via a linear block code , and there exist two players: an attacker () and a detector (), as seen in Fig. 2. Given the encoding-decoding scheme and the underlying statistical profiles, seeks to detect any intervention by while seeks to modify the smart codes physically, as exemplified in Fig. 1(d), in order to lead to decoding error/failure stealthily.
Noise Model. The decoder reads a noisy version of the smart code due to, e.g., lighting-induced blurring or harsh weather conditions. Let denote the codeword space. Then, we model this noise via a probability transition mapping corresponding to the probability of receiving codeword given that the transmitted codeword is . We suppose that all the symbol errors by nature are equally likely and independent of each other. We denote the probability that there can be an error in a symbol by . We also suppose that the change of the symbol to any other symbol in the alphabet is equally likely in a symbol error.
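As a rough illustration of this noise model, the following sketch samples a received word from a sent word. The function name `noisy_channel`, the parameter name `p` for the per-symbol error probability, and the example alphabet are assumptions made here for illustration.

```python
import random


def noisy_channel(word, alphabet, p, rng):
    """Perturb each symbol independently with probability p.

    A perturbed symbol is replaced by one of the other symbols in the
    alphabet, chosen uniformly at random.
    """
    received = []
    for symbol in word:
        if rng.random() < p:
            others = [a for a in alphabet if a != symbol]
            received.append(rng.choice(others))
        else:
            received.append(symbol)
    return tuple(received)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
alphabet = (0, 1, 2, 3)
sent = (0, 1, 2, 3, 0, 1)
received = noisy_channel(sent, alphabet, 0.2, rng)
symbol_errors = sum(a != b for a, b in zip(sent, received))
```

Setting `p` to 0 returns the sent word unchanged, and setting it to 1 changes every symbol, which gives two quick sanity checks on the sampler.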
Remark (Symbol Error).
In a codeword, a symbol consists of multiple contiguous bits. A symbol error occurs if at least one of these bits is perturbed. Correspondingly, if random perturbations infect multiple contiguous bits, the number of symbol errors is an appropriate distance measure. This is indeed the case in smart road signs due to possible obfuscation by plants or graffiti or adversarial stickers as studied in [2] or as illustrated in Fig. 1(d). When there are blurring due to lighting throughout a day, fading colors, or weather conditions, we would also expect perturbations on multiple contiguous bits, instead of perturbations on single isolated bits (which may require surgical-like precision due to the relatively small size of a single bit). Furthermore, error-correction codes, e.g., Reed-Solomon codes, provide effective guarantees against symbol errors.
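The difference between counting bit errors and counting symbol errors can be sketched as follows. The symbol width of 4 bits and the two perturbation patterns are hypothetical values picked for illustration.

```python
def to_symbols(bits, width):
    """Group a bit sequence into symbols of `width` contiguous bits."""
    if len(bits) % width:
        raise ValueError("bit length must be a multiple of the symbol width")
    return [tuple(bits[i:i + width]) for i in range(0, len(bits), width)]


def count_errors(sent_bits, recv_bits, width):
    """Return (bit_errors, symbol_errors) between two bit sequences."""
    bit_errors = sum(a != b for a, b in zip(sent_bits, recv_bits))
    symbol_errors = sum(
        a != b
        for a, b in zip(to_symbols(sent_bits, width), to_symbols(recv_bits, width))
    )
    return bit_errors, symbol_errors


def flip(bits, positions):
    """Return a copy of `bits` with the given positions flipped."""
    return [b ^ 1 if i in positions else b for i, b in enumerate(bits)]


sent = [0] * 16                  # four symbols of four bits each
burst = flip(sent, {4, 5, 7})    # contiguous damage, e.g. a sticker
scattered = flip(sent, {0, 5, 10})  # isolated single-bit damage

burst_counts = count_errors(sent, burst, 4)
scattered_counts = count_errors(sent, scattered, 4)
```

Both patterns flip three bits, but the contiguous one costs a single symbol error while the scattered one costs three, which is why symbol errors are the natural unit when damage tends to be spatially clustered.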
Defense Model. has multiple objectives:
- to minimize the cost of losing the opportunity to prevent future attacks by not being able to detect it now,
- to minimize the cost associated with adversary-induced decoding error/failure,
- to minimize the cost associated with false alarms,
- to maximize the number of symbol errors necessary to deceive the decoder.
We model ’s cost function as a linear combination of these objectives with certain multiplicative factors, which give flexibility to control the weight of the corresponding objective as desired. This cost function will be defined explicitly in the game model.
We let denote the set of encoded codewords, i.e., there exists bijective relation in-between and the set of road signs. We also let denote the set of decodable codewords that are within diameter of a codeword in . If , we denote the decoder output by . The loss due to decoding error/failure is given by
[TABLE]
If , can report an issue against the possibility of adversarial intrusion so that further (costly) investigations can take place. To this end, designs a randomized detection rule , where , , corresponds to the probability of triggering an alert for symbol errors. If , further investigations always take place.
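A minimal sketch of such a randomized detection rule is given below. It assumes the rule is stored as a list `delta` indexed by the number of symbol errors between the received word and the decoder output, with indices from 0 up to the correction radius `t`; the function name `detect` and the toy codebook are hypothetical.

```python
import random


def hamming(a, b):
    """Number of positions where two equal-length words differ."""
    return sum(x != y for x, y in zip(a, b))


def detect(received, encoded, t, delta, rng):
    """Return (alert, decoder_output) for a received word.

    If the word is decodable with e symbol errors, an alert is raised
    with probability delta[e]. If it is not decodable, an alert is
    always raised and the decoder output is None.
    """
    for codeword in encoded:
        errors = hamming(received, codeword)
        if errors <= t:
            alert = rng.random() < delta[errors]
            return alert, codeword
    return True, None


encoded = ["0000", "1111", "2222"]  # toy codebook (assumption)
t = 1
delta = [0.0, 1.0]  # never alert on 0 errors, always alert on 1 error
rng = random.Random(0)

clean = detect("0000", encoded, t, delta, rng)
one_error = detect("0111", encoded, t, delta, rng)
undecodable = detect("0011", encoded, t, delta, rng)
```

The extreme values 0 and 1 in `delta` make the example deterministic. In practice the entries would be the probabilities computed offline from the equilibrium analysis.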
Remark (Scalable Defense).
We consider a randomized detection rule depending on the number of symbol errors for scalability. If were to select a (randomized) detection rule based on the received codeword, then would select a vector over the space , which is dimensional. Note that is exponential in the number of symbols whereas is linear in .
Threat Model. is the worst case attacker who maximizes ’s cost function. To this end, can select which road sign to attack. Let denote the codeword of the attacked road sign. Then, can craft to by introducing error in order to control the decoder output. The complexity of this crafting is given by the number of symbols changed, i.e., . We denote ’s action space by and denote ’s action by . can select a mixed strategy over such that denotes the probability of taking action , i.e., attacking the codeword and crafting it to .
Game Model. We consider a zero-sum game setting where seeks to minimize the cost function:
[TABLE]
against the worst-case who seeks to maximize (III). We define certain multiplicative factors , , corresponding, respectively, to ’s objectives . Note that minimization of the expected cost due to the uncertainty of the channel is in-line with the expectation-over-transformation framework proposed in [13]. The attackers can generate robust attacks by considering the expected impact of the uncertainties due to the channel [13].
Remark (Attack Probability).
If has a priori information corresponding to the probability of adversarial intervention, then we can incorporate this into (III) by selecting the multiplicative factors accordingly. For example, the objectives , , and matter if there is an adversarial intervention while the objective matters if there is no adversarial intervention. To this end, we can scale up , , and by while scaling up by .
We consider a hierarchical setting, where can know (or learn) ’s randomized detection algorithm, in order to avoid obscurity based defense, which can be bypassed when an advanced attacker learns the information in obscurity. Therefore, this interaction can be modeled as a Stackelberg zero-sum game [Footnote 1: denotes the probability simplex formed by standard unit vectors.]
[TABLE]
where is the leader and is the follower. Since is the follower and reacts to ’s strategy , the problem faced by the detector is given by
[TABLE]
The following proposition shows that there exists an equilibrium to the game .
Proposition 1 (Existence Result).
There exists a pair of ’s strategy and ’s reaction attaining the Stackelberg equilibrium , i.e., satisfying (6).
Proof.
Note that is linear, and correspondingly, continuous in the optimization arguments and , and the constraint sets are decoupled. Therefore the maximum theorem [33] yields that
[TABLE]
is a continuous function of . Then, since is a compact set, the extreme value theorem yields that there exists a solution for (6). ∎
In the following section, we analyze the equilibrium to .
IV Adversarial Intervention Detection Across Smart Road Signs
Existence of an equilibrium is guaranteed as shown in Proposition 1. However, computation of the equilibrium can be demanding (even if it is an offline computation) since has a large strategy space, even for short codewords. In this section, our goal is to examine ’s best response for efficient computation of the best detection rule. To this end, we first seek to formulate certain equivalence classes over ’s actions such that all the actions in a class lead to the same outcome of the game (see Subsection IV-A). However, depending on the size of the input space, i.e., , computation of the equilibrium may still be demanding for that quotient space. In order to avoid such a computational issue for long codewords that can express a relatively larger collection of road signs, we relax the constraints on ’s action space, which will lead to a more powerful attacker than in practice (see Subsection IV-C). This yields a conservative defense, which leads to lower cost against the actual attacker who is relatively less powerful in run-time applications. Finally, we transform the problem into an efficient LP, rather routinely, in order to apply existing powerful computational tools to compute the best detection rule. We now provide the details of these steps.
Our goal is to compute the best detection rule with respect to the equilibrium (6). To this end, needs to anticipate ’s reaction to any selected detection rule. However, ’s action space has dimension , which is exponential in . Therefore, finding the best reaction, i.e., a vector in , for each detection rule is computationally demanding. Accordingly, in the following, we seek to reduce ’s strategy space without loss of generality.
IV-A Equivalence Classes on ’s Best Response
Fig. 3 provides a figurative illustration of how can attack. Note that error-correction methods ensure that different encoded codewords are at least a certain number of symbols away from each other. For the code in Fig. 3, this minimum distance is symbols as exemplified in between and . Any perturbation that can change at most symbols does not lead to any decoding error or failure. However, at certain directions, perturbations on symbols can lead to a decoding error, e.g., by carrying to the decodable region of , or a decoding failure depending on how and which symbols are perturbed. Correspondingly, if decides to attack on , then has several choices while physically damaging the corresponding smart code. In the following, we categorize those choices into four main groups:
- may not attack, i.e., may not introduce any error.
- may introduce relatively smaller amount of symbol error(s) such that the corrupted codeword is still in the decodable region of the associated codeword. Due to the channel, this can still lead to decoding error or failure with certain probabilities.
- may introduce symbol errors such that the corrupted codeword becomes not decodable.
- may introduce relatively larger amount of symbol errors such that the codeword intervened is in the decodable region of another codeword.
Even if there were not any detection rule, in Fig. 3, we observe that not attacking, i.e., , or attacking relatively more aggressively, e.g., and , are not necessarily more preferable for than due to the noisy perturbations and the trade-off between the amount of perturbation and the gain of by decoding error/failure. In the following, we examine the channel, which can lead to such intriguing results.
Recall that any symbol can be perturbed by the channel with the same probability while the symbol perturbed can change to any other symbol in the alphabet with the same probability, i.e., . Then, the probability that turns into due to noisy channel can be written as
[TABLE]
which only depends on the distance in-between and . Particularly, there are symbols that match at and . There should not be any perturbations on those symbols, which leads to the first multiplicative term on the right-hand-side of (8). For each symbol that does not match, the random perturbation must change the one at to the corresponding one at among equally likely alternatives, which leads to the second multiplicative term.
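The two multiplicative terms described above can be written out in a short sketch. The symbol names are assumptions introduced here: `p` is the per-symbol error probability, `q` the alphabet size, and `x`, `y` the word before and after the channel. The check at the end sums the probability over every possible received word.

```python
from itertools import product


def transition_prob(x, y, p, q):
    """Probability that the channel turns word x into word y.

    Matching symbols must stay untouched, each with probability 1 - p.
    Each differing symbol must be perturbed (probability p) into the
    specific target symbol, one of q - 1 equally likely alternatives.
    """
    n = len(x)
    dist = sum(a != b for a, b in zip(x, y))
    return (1 - p) ** (n - dist) * (p / (q - 1)) ** dist


# Sanity check on a toy setting (assumption): ternary words of length 3.
q, n, p = 3, 3, 0.1
x = (0, 1, 2)
total = sum(transition_prob(x, y, p, q) for y in product(range(q), repeat=n))
```

Because the value depends on `x` and `y` only through their Hamming distance, all received words at the same distance from `x` are equally likely.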
Since only depends on , we define an auxiliary metric , where denotes the probability that two codewords that are symbols away become symbols away when one of them is randomly perturbed by the channel. Note that we can compute based on combinatorics analytically or using the Monte Carlo method [34] numerically. With this new auxiliary metric, let us take a closer look into the objectives and , where the channel and have impact on. Firstly, the term in parenthesis in can be written as
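One way to compute such a distance-shift probability analytically is sketched below, under the noise model stated earlier. The derivation is an assumption of this sketch rather than the paper's own formula: for a word that is `i` symbols away from a reference word, each of the `n - i` matching positions becomes a mismatch with probability `p`, and each of the `i` differing positions stays a mismatch with probability `1 - p / (q - 1)`, so the new distance is a sum of two independent binomial counts. A brute-force enumeration is included as a cross-check.

```python
from itertools import product
from math import comb


def binom_pmf(trials, prob, successes):
    """Binomial probability of `successes` out of `trials`."""
    return comb(trials, successes) * prob ** successes * (1 - prob) ** (trials - successes)


def distance_shift_prob(i, j, n, p, q):
    """P(a word at distance i from a reference is at distance j after the channel)."""
    keep_mismatch = 1 - p / (q - 1)
    total = 0.0
    for a in range(min(n - i, j) + 1):  # new mismatches among matching positions
        b = j - a                       # surviving mismatches among differing ones
        if b <= i:
            total += binom_pmf(n - i, p, a) * binom_pmf(i, keep_mismatch, b)
    return total


def brute_force(i, j, n, p, q):
    """Same quantity by enumerating every possible received word."""
    reference = (0,) * n
    start = (1,) * i + (0,) * (n - i)  # exactly i symbols away from reference
    total = 0.0
    for y in product(range(q), repeat=n):
        if sum(a != b for a, b in zip(reference, y)) == j:
            moved = sum(a != b for a, b in zip(start, y))
            total += (1 - p) ** (n - moved) * (p / (q - 1)) ** moved
    return total


n, p, q = 4, 0.1, 3
table = [[distance_shift_prob(i, j, n, p, q) for j in range(n + 1)] for i in range(n + 1)]
```

The enumeration grows as `q ** n` and is only practical for toy sizes, whereas the closed form needs a number of terms linear in `n` per entry. A Monte Carlo estimate would be a third option when the channel model is harder to treat analytically.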
[TABLE]
where the under-braced term corresponds to the total probability that moves to symbols away from an encoded codeword due to the random noise. Similarly, we can write the inner summation in as
[TABLE]
where the first line follows since a detection error or failure occurs if the received codeword is more than symbols away from the codeword before crafts it into . The second line follows by the definition of the new auxiliary metric .
Note that written according to (9) depends on the distance between and all the encoded codewords . Similarly, only the distance between and has an impact on and while we also have . Therefore, the attacks that target have the same impact on the cost function (III) if the distances between and the encoded codewords are the same.
Indeed, there is a strong coupling on how would select and independent of ’s strategy. Particularly, (9), and correspondingly , do not include . On the other side, the objectives and , which include , do not include . Therefore for given , can select irrespective of ’s detection rule. Based on this observation in the following lemma, we eliminate weakly dominated actions of in order to reduce the size of ’s strategy space.
Lemma 1.
In the game , without loss of generality, we can restrict ’s action space into
[TABLE]
where is the maximizer of the optimization problem:
[TABLE]
Proof.
The terms corresponding to and in (III) can be written as
[TABLE]
which follows by (11). Since seeks to maximize the cost (III), for each , we can compute the associated optimal attacked codeword, i.e., , via (13), where a solution is guaranteed to exist since the constraint set is finite. ∎
Note also that (9), (11), and (13) depend only on the distances between and all the encoded codewords. Therefore, for a given detection rule, any other that has the same set of distances to the encoded codewords would lead to the same cost (III). Correspondingly, we define another auxiliary function such that is a vector whose th entry, denoted by , corresponds to the distance in-between and the th encoded codeword (with respect to a certain order in ). Then, given , (13) can be written as
[TABLE]
where denotes the set including the entries of the vector . We can view as the reward of for . Therefore, by (9), (11), and (15), the cost function can be written as
[TABLE]
The following lemma recaps these results to formulate the equivalence classes on ’s best response.
Lemma 2.
Without loss of generality, instead of mixing over , can select a mixed strategy across the quotient set with respect to the following equivalence relation:
[TABLE]
for some permutation matrix .
Proof.
The cost function, as written in the form of (16), depends on only with respect to the distances between and the encoded codewords while the specific identities of the encoded codewords do not impact the cost function. Correspondingly, any permutation of the distances across the encoded codewords would lead to the same cost. ∎
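The equivalence relation of Lemma 2 can be illustrated on a toy example. The sketch below groups all words by their sorted vector of Hamming distances to the encoded codewords, so that any two words whose distance vectors are permutations of each other fall into the same class. The binary repetition code and the helper names are hypothetical and chosen only to keep the enumeration small.

```python
from collections import defaultdict
from itertools import product

def distance_profile(word, codebook):
    """Sorted Hamming distances from `word` to every encoded codeword.

    Sorting removes the identity of the codewords, so two words with
    permuted distance vectors receive the same profile.
    """
    return tuple(sorted(sum(a != b for a, b in zip(word, c)) for c in codebook))

def quotient_classes(alphabet, n, codebook):
    """Group all length-n words over `alphabet` by their distance profile."""
    classes = defaultdict(list)
    for word in product(alphabet, repeat=n):
        classes[distance_profile(word, codebook)].append(word)
    return dict(classes)

# Toy codebook (assumption): binary repetition code of length 3.
codebook = [(0, 0, 0), (1, 1, 1)]
classes = quotient_classes((0, 1), 3, codebook)
for profile, words in classes.items():
    print(profile, len(words))
```

Here the eight possible words collapse into two classes, which is the kind of reduction in the attacker's strategy space that the lemma exploits.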
To facilitate the analysis of the equilibrium, we next write the cost function (20) in a compact form.
IV-B Equilibrium in Compact Form
Up to now, we have focused on the objectives except , on which ’s strategy does not have direct impact. Similar to (9), via the auxiliary functions and , we can write as
[TABLE]
By including (19) in (16) and invoking Lemma 2, we can write the cost function in a way that confines the impact of the channel into the auxiliary metric :
[TABLE]
Consider a certain order over the quotient space such that, with a slight abuse of notation, corresponds to the mixed strategy for the th action in . For notational simplicity, we also let and . Then, (20) can be written as
[TABLE]
which can also be transformed into a compact vector form. To this end, we define the vectors and , whose th entries are given by and , respectively. We also introduce the matrices and whose th row and th column entries are given by
[TABLE]
respectively. We note the shift at the column entries since we have , instead of . Then, we can write (20) as
[TABLE]
which facilitates the computation of the equilibrium. However, the size of can lead to computational issues for long codewords, i.e., large , even though it has relatively smaller size compared to without losing any generality as shown in Lemma 2. To mitigate this issue, in the following, we relax the attack space so that the size of the problem can be reduced further based on the derived equivalence relation (17).
IV-C Relaxing Attack Space at Large Scales
The cost function in the compact form (23) implies that we need to focus on the first and second additive terms that include ’s mixed strategy in order to reduce ’s strategy space. Note that on those additive terms, is multiplied by the matrix and the vector . We can seek to exploit certain properties of and . To this end, we will first show that the matrix can be written as in (28), where is a vector which can be viewed as the histogram of the distances from to the encoded codewords, i.e., . Next, we will examine in order to formulate necessary conditions on the histogram . By only considering those necessary conditions, we relax ’s strategy space such that he/she selects a mixed strategy from a strategy space with substantially smaller size. Now, we provide the technical details step by step.
Step-. A Closer Look at the Matrix : Recall that the th row and the th column entry of is given by
[TABLE]
where the summation is taken across all the encoded codewords. However, we can separate this summation into sub-summations with respect to the distance from to the encoded codewords. In particular, we have
[TABLE]
where denotes the number of encoded codewords that are symbols away from , and the second line follows since for all that satisfies . Correspondingly, for each , , we define the following dimensional vector
[TABLE]
Then, (26) and (27) yield that can be written as
[TABLE]
Note that all the entries of , , are non-negative integers and sum to the number of all encoded codewords . However, these are not necessarily sufficient conditions. Note also that we can view the vector as the histogram of the entries of . Based on this observation, in the following, we seek tighter necessary conditions on by examining .
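To make the histogram view concrete, the following minimal sketch counts, for a given word, how many encoded codewords lie at each Hamming distance. The even-weight binary codebook and the function name `distance_histogram` are hypothetical; only the two necessary conditions stated above (non-negative integer entries that sum to the number of encoded codewords) are checked.

```python
from collections import Counter

def distance_histogram(word, codebook):
    """hist[d] = number of encoded codewords exactly d symbols away from `word`."""
    n = len(word)
    counts = Counter(sum(a != b for a, b in zip(word, c)) for c in codebook)
    return [counts.get(d, 0) for d in range(n + 1)]

# Toy codebook (assumption): the even-weight binary words of length 3.
codebook = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
hist = distance_histogram((1, 1, 1), codebook)
print(hist)  # [0, 3, 0, 1]
```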
Step-. An Upper Bound on : We first examine the minimum possible distance between an arbitrary codeword and an encoded codeword. Note that a codeword consists of the message and redundantly added symbols:
[TABLE]
For each message in , there exists a unique encoded codeword. Correspondingly the minimum distance between an arbitrary codeword and encoded codewords, i.e., , can be at most since the message part of matches with at least one encoded codeword completely. Therefore, formally, we have
[TABLE]
Step-. A Gap in the Ordered : Since the codewords are encoded such that they are distributed across with maximum distance in between them, if an arbitrary codeword is relatively close to one of the encoded codewords, e.g., if it is inside the decodable region, the distances between that arbitrary codeword and the other encoded codewords are relatively large. In other words, when we list all the distances from that arbitrary codeword to the encoded codewords in ascending order, then there will be a jump between the distance to the closest one and the distance to the second closest one. For example, if , then there is no other encoded codeword within a diameter of symbols away from .
Particularly, if is in a decodable region of an encoded codeword, e.g., ; then is the encoded codeword closest to , i.e., , and there exists only that encoded codeword within diameter.
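The conditions in the last two steps can be checked exhaustively on a small maximum distance separable code. The sketch below builds a toy Reed-Solomon code by polynomial evaluation over GF(5) and verifies two properties for every possible word: the closest encoded codeword is never farther than `N - K` symbols away (we assume this is the bound intended above), and a word inside a decodable region has its second closest encoded codeword at least `d_min - t` symbols away. The parameters `Q`, `N`, `K` and the evaluation points are our own choices for illustration.

```python
from itertools import product

Q, N, K = 5, 4, 2        # toy alphabet size, codeword length, message length (assumptions)
POINTS = [1, 2, 3, 4]    # N distinct evaluation points in GF(5)

def encode(message):
    """Evaluate the message polynomial at each point, modulo the prime Q."""
    return tuple(
        sum(m * pow(x, i, Q) for i, m in enumerate(message)) % Q for x in POINTS
    )

def hamming(a, b):
    return sum(u != v for u, v in zip(a, b))

codebook = [encode(m) for m in product(range(Q), repeat=K)]
d_min = min(hamming(a, b) for a in codebook for b in codebook if a != b)
t = (d_min - 1) // 2     # radius of the decodable regions

worst_closest = 0        # largest distance from any word to its closest codeword
closest_second = N       # smallest distance to the second closest codeword, decodable words only
for word in product(range(Q), repeat=N):
    ds = sorted(hamming(word, c) for c in codebook)
    worst_closest = max(worst_closest, ds[0])
    if ds[0] <= t:
        closest_second = min(closest_second, ds[1])

print(d_min, t, worst_closest, closest_second)
```

For this toy code the enumeration covers only 625 words, so the check is immediate; for the code sizes considered in the paper such brute force is not feasible, which is the motivation for the relaxation.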
Step-. A Contiguousness Assumption on the Ordered : We have formulated certain necessary conditions on the distance from an arbitrary codeword to the closest and second closest encoded codewords. For the distances to the other encoded codewords, we observe that at large scales, the number of messages , i.e., the number of encoded codewords, is significantly larger than the length of the codewords . We suppose that if , then there exists at least one encoded codeword at the distances . Otherwise, i.e., if , there exists at least one encoded codeword at all the distances starting from the closest one to . In particular, formally, we suppose that
[TABLE]
since if is in the decodable region of an encoded codeword, i.e., .
Step-. The Histogram Under Necessary Conditions: Based on the necessary conditions derived in Steps -, in the following, we formulate the necessary conditions on under two cases depending on . If is in a decodable region, i.e., , then we have
[TABLE]
where corresponds to an unspecified positive integer. If is not in any decodable region, i.e., , then we have
[TABLE]
Step-. Approximation Under Necessary Conditions: Note that the unspecified entries of may not necessarily take arbitrary values; however, we will relax this and suppose that the unspecified entries can be set to arbitrary values by as long as they are all positive and all entries sum to . As an illustration, when we concatenate for different scenarios where varies from [math] to , we obtain the following matrix:
[TABLE]
where the entry denoted by is [math] if is even, and is an unspecified positive integer if is odd. For example, the first column corresponds to whose , which yields that the second closest encoded codeword can be as close as symbols away.
Step-. ’s Relaxed Strategy Space: Based on the relaxation that the unspecified entries can take any values, we seek to reduce ’s strategy space, which is the main reason for all the steps we have taken up to now. To this end, we first recall that the unspecified entries are all positive and add up to a certain number, which is if , and otherwise. Let us consider an arbitrary column in (34), e.g., th column where . Then, the set of all such possible is given by
[TABLE]
Correspondingly, the set of extreme points of this set (a point in a convex set is an extreme point if it cannot be expressed as a convex combination of two other points from that set) is given by
[TABLE]
Note that any point in the set (35) can be expressed as a convex combination of its extreme points identified in (36).
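The convex-combination argument can be sketched numerically. We assume here that the relaxed set consists of vectors with `m` entries, each at least 1, that sum to a fixed `total`; this is our reading of "positive integers that add up to a certain number" and may differ in detail from (35). Under that assumption each extreme point puts all the slack `total - m` on a single entry, and the weights of the convex combination follow in closed form.

```python
def extreme_points(m, total):
    """Extreme points of {x : x_i >= 1, sum(x) = total}; row i puts all slack on entry i."""
    slack = total - m
    return [[1 + slack * (i == j) for j in range(m)] for i in range(m)]

def convex_weights(x):
    """Weights w with w_i >= 0, sum(w) = 1, and sum_i w_i * extreme_point_i = x."""
    m, slack = len(x), sum(x) - len(x)
    if slack == 0:
        return [1.0 / m] * m  # the set is the single point (1, ..., 1)
    return [(xi - 1) / slack for xi in x]

x = [2, 5, 1]
weights = convex_weights(x)
vertices = extreme_points(len(x), sum(x))
rebuilt = [sum(w * v[j] for w, v in zip(weights, vertices)) for j in range(len(x))]
print(weights, rebuilt)
```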
Therefore, we can express any convex combination of , , by a convex combination of the columns of the matrix defined in (49) (we suppose that is odd, i.e., ; the matrix for the cases where is even can be computed accordingly), where
[TABLE]
and by the Singleton bound (1). In other words, under the relaxation, for any given mixed strategy across , there exists a mixed strategy over the columns of such that we have
[TABLE]
which, by (28), yields that
[TABLE]
where , as defined in (28).
Step-. A Closer Look at the Vector : Next, we seek to compute the reward . Recall that the reward for depends only on , as defined in (15). However, depends only on as shown in Steps -. Therefore, (31) yields that depends mainly on the distance to the closest encoded codeword. Based on (15) and (31), we define an auxiliary vector , where , for , is given by
[TABLE]
where denotes the disjunction operation. This yields that corresponds to the reward when .
Note that for all that have , the associated reward is . Therefore, with the mixed strategy introduced in Step-, we have
[TABLE]
where , and the dimensions of the vector at the th row is if , and if .
Step-. Transforming ’s Strategy Space to a Simplex at a Higher Dimensional Space: Our goal, here, is to transform ’s strategy into a mixed strategy at a higher dimensional space in order to be able to transform the problem into an LP as will be explained in detail later in this section. To this end, we can view as selects mixed strategies over two element sets, e.g., . This yields that selects a mixed strategy over the Cartesian product space of these sets, i.e., , which is
[TABLE]
dimensional. For example, for , the corresponding mixed strategy, denoted by , is over . This yields that there exists a matrix such that . As an example, for , we have
[TABLE]
Step-. New Compact Form: Eventually, for the relaxed attack strategies, we can write ’s cost function in the following compact form:
[TABLE]
where , , and
[TABLE]
which follows since we have and , which yields, e.g., .
In the following lemma, we provide an LP to compute the best detection rule.
Lemma 3.
After the relaxation of ’s strategy space, the best detection rule is given by
[TABLE]
where is the solution of the following LP:
[TABLE]
where the positive matrix (we say that a matrix is positive if all of its entries are positive) is defined by
[TABLE]
where and is the minimum entry of .
Proof.
Note that by definition, we have
[TABLE]
We are interested in only the upper value since we seek to compute the Stackelberg equilibrium where selects the strategy by knowing ’s strategy . However, since the objective functions are linear in the optimization arguments while the constraint sets are convex, decoupled, and compact, the minimax theorem [24] shows that
[TABLE]
which implies that the upper and lower values of the game are equal and that we have a saddle-point equilibrium in (45). Therefore, we can apply the rather routine transformation of the mixed-strategy equilibrium of zero-sum matrix games into an LP [24] in order to compute the best detection rule.
A sketch of the routine transformation of (45) into an LP [24] is as follows: we show that the game (45) is strategically equivalent to a game where the game matrix is a positive matrix; we can write (45) as the minimization of ’s best response; we can obtain a certain necessary condition on in terms of ’s best response since is a simplex; through a change of variable, we can obtain the equivalent LP (48). ∎
Corollary 1.
The solution for the dual problem of (48), i.e.,
[TABLE]
yields that
[TABLE]
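The routine transformation sketched in the proof, together with its dual, can be illustrated for a generic zero-sum matrix game. The following is a minimal sketch that uses SciPy's `linprog` rather than CVX; the cost matrix, the function name `solve_zero_sum`, and the particular shift used to make the matrix positive are assumptions for illustration and do not reproduce the matrices of (48).

```python
import numpy as np
from scipy.optimize import linprog

def solve_zero_sum(cost):
    """Rows are the minimizer's (defender's) actions, columns the maximizer's (attacker's)."""
    cost = np.asarray(cost, dtype=float)
    shift = 1.0 - cost.min()          # strategically equivalent game with all entries >= 1
    A = cost + shift
    m, n = A.shape
    # Primal (defender): maximize 1'y subject to A'y <= 1, y >= 0.
    primal = linprog(-np.ones(m), A_ub=A.T, b_ub=np.ones(n), method="highs")
    # Dual (attacker): minimize 1'z subject to A z >= 1, z >= 0.
    dual = linprog(np.ones(n), A_ub=-A, b_ub=-np.ones(m), method="highs")
    value_shifted = 1.0 / primal.x.sum()
    defender = primal.x * value_shifted   # change of variable back to a mixed strategy
    attacker = dual.x / dual.x.sum()
    return defender, attacker, value_shifted - shift

# Hypothetical 2x2 cost matrix (matching pennies), not taken from the paper.
defender, attacker, value = solve_zero_sum([[1.0, -1.0], [-1.0, 1.0]])
print(defender, attacker, value)
```

The positivity shift guarantees that the game value is strictly positive, which is what makes the change of variable `y = p / v` well defined.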
In the following section, we analyze the performance of the proposed detection mechanism numerically for various scenarios.
V Numerical Examples
The proposed framework can be applied to any linear block code since the analytical results are based only on the abstraction of the code, i.e., . Therefore, in order to compute the detection rule, specific to the underlying encoding-decoding scheme, we need the configuration of the code, i.e., , and the matrix , as defined in (28). At run-time, the detection mechanism triggers an alert based only on the number of mismatched symbols.
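A run-time check of this kind could look as follows. This is a sketch under the assumption that the computed detection rule is a randomized rule indexed by the number of mismatched symbols between the received word and the decoded codeword; the vector `alert_prob` and both function names are hypothetical placeholders for whatever the equilibrium computation returns.

```python
import random

def count_mismatches(received, decoded):
    """Number of symbols in which the received word differs from the decoded codeword."""
    return sum(r != c for r, c in zip(received, decoded))

def should_alert(received, decoded, alert_prob, rng=random):
    """Raise an alert with probability alert_prob[e], where e is the mismatch count."""
    e = count_mismatches(received, decoded)
    return rng.random() < alert_prob[e]

# Hypothetical rule for codewords of length 4: never alert on 0 or 1 mismatches,
# always alert on 2 or more.
alert_prob = [0.0, 0.0, 1.0, 1.0, 1.0]
print(should_alert((1, 2, 3, 4), (1, 2, 3, 0), alert_prob))  # one mismatch
print(should_alert((1, 2, 3, 4), (1, 0, 3, 0), alert_prob))  # two mismatches
```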
As numerical examples, in this section, we examine the performance of the proposed detection mechanism for smart codes constructed via Reed-Solomon coding [31]. Particularly, the Reed-Solomon code (RS-Code) is a maximum distance separable code, i.e., it maximizes the minimum distance between any two distinct codewords within the general class of linear block codes, and it is widely used in applications, e.g., QR codes. The minimum distance of an RS-Code is given by
[TABLE]
which attains the Singleton bound (1) for linear block codes. In practical implementations, the alphabet size is in general selected as a prime power and the length of the codeword is set to , e.g., often .
As illustrative examples, we examine the performance of the proposed detection mechanism for the RS-Codes: , , , , , and such that the corresponding distances for the decodable regions are given by , respectively. For each RS-Code, Table I tabulates the maximum number of distinct road signs that can be encoded, the number of bits in the codeword, i.e., (which can give an idea about the size of the associated smart code), dimensions of the mixed strategies and , and the decodable distance . The number of distinct road signs that a code can express is not directly related to the decodable distance. For example, can encode as many as around million distinct road signs, but its decodable distance is , which is less than the decodable distance of , which can encode as many as distinct road signs.
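The quantities of the kind tabulated in Table I follow directly from the code parameters. The sketch below uses the standard relations for a Reed-Solomon code with length `n`, message length `k`, and alphabet size `q`; the class name and the example parameters are ours and are not necessarily the codes examined in the paper.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RSCode:
    n: int  # codeword length in symbols
    k: int  # message length in symbols
    q: int  # alphabet size

    @property
    def d_min(self):
        """Minimum distance of a maximum distance separable code: n - k + 1."""
        return self.n - self.k + 1

    @property
    def decodable_distance(self):
        """Largest number of symbol errors that is always corrected."""
        return (self.d_min - 1) // 2

    @property
    def num_messages(self):
        """Number of distinct messages (distinct road signs) that can be encoded."""
        return self.q ** self.k

    @property
    def bits(self):
        """Bits needed to write one codeword, with ceil(log2(q)) bits per symbol."""
        return self.n * (self.q - 1).bit_length()

# Illustrative parameters only.
for code in (RSCode(15, 11, 16), RSCode(15, 3, 16), RSCode(255, 223, 256)):
    print(code, code.d_min, code.decodable_distance, code.num_messages, code.bits)
```

The first two examples show the trade-off noted above: with the same length and alphabet, the code that encodes more messages has the smaller decodable distance.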
In order to examine the performance across a range of relatively low- to high-noise channels, we consider different channels with probabilities of symbol errors: . For each channel, in Table II, we tabulate the probability of decoding error/failure for the RS-Codes. We highlight the error/failure probabilities that are less than . Note that a high error rate yields that the associated code is not reliable even when there is no adversarial intervention. Correspondingly, if a code leads to a higher error/failure probability, then we can prefer codes that include more redundancy to improve reliability.
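A probability of this kind can be sketched with a binomial tail. We assume independent symbol errors with probability `p` and a decoder that succeeds exactly when at most `t` symbols are in error, so the error/failure event is "more than `t` symbol errors"; the function name and the parameter values are hypothetical and are not the entries of Table II.

```python
from math import comb

def prob_error_or_failure(n, t, p):
    """P(more than t of n symbols are in error) for i.i.d. symbol errors with probability p."""
    within_radius = sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(t + 1))
    return 1.0 - within_radius

# Illustrative sweep: the same codeword length with increasing decodable distance.
for t in (2, 4, 6):
    print(t, prob_error_or_failure(n=15, t=t, p=0.05))
```

As expected, adding redundancy (a larger `t` for the same `n`) lowers the error/failure probability for a fixed channel.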
Next, we compare the reliability of the smart road signs with respect to the cost metric (III) for the cases with and without the proposed detection mechanism. For example, we set all the road signs to be equally likely and we use the Monte Carlo method to compute over independent trials. We set the multiplicative factors as
[TABLE]
where we set the weight of the objective , i.e., false alarm cost, in order to keep the probability of false alarm at a certain range, e.g., less than , for the scenarios where the code has error/failure probability less than . In order to solve the LPs numerically, we use CVX, a package for specifying and solving convex programs [35, 36].
In order to compute the (conservative) cost for the scenarios where there is no detection mechanism, we compute
[TABLE]
where
[TABLE]
Particularly, since the rightmost column of is a zero vector, as exemplified in (44), yields that . In Table III, we tabulate the (conservative) cost of the codes over the channels examined if there were no detection mechanism. Note that we highlight the cells corresponding to the error/failure probabilities less than in order to distinguish the scenarios where the associated RS-Code can be used reliably. In Table IV, we tabulate the conservative cost of the codes with the proposed detection mechanism. The corresponding false alarm rates are provided in Table V. A comparison of Tables III and IV shows a substantial decrease in the conservative cost at the expense of a false alarm rate less than in the scenarios where the error/failure probability is less than .
Remark.
We propose a way to relax certain constraints on the attack space to mitigate the scalability issue. It can be possible to obtain tighter approximations by considering tighter necessary conditions on ’s actions; however, this would also increase computational complexity.
Furthermore, in Figs. 4, 5, and 6, we provide the probability of the number of symbol errors due to channels, the proposed detection rule, and the relaxed attack strategy, respectively. We observe that triggers alerts if the error rate is relatively high in general, which turns out to restrain the (powerful) attacker to put more weight on the leftmost columns of , as defined in (49), in his/her (relaxed) attack strategy. Particularly, at the equilibrium, the powerful attacker ends up crafting the smart code relatively more aggressively similar to the choice as discussed in Subsection IV-A. Depending on the channel and the configuration of the code, the optimal detection rules can vary. Through the proposed mechanism, based on a game theoretical analysis, we can compute the best detection rule efficiently and systematically even at scales of around million distinct road signs using an average personal computer without difficulty.
VI Conclusion
A future trend in intelligent transportation systems is smart road signs equipped with smart codes. In addition to incorporating a relatively larger amount of information, smart codes constructed via error-correction methods can provide robustness against small scale perturbations. We have introduced a game theoretical adversarial intervention detection mechanism for reliable smart road signs against threats that can perturb the smart codes at small or large scales intelligently. While designing the detection mechanism, we have considered multiple performance metrics regarding the cost associated with losing the opportunity of preventing future attacks by not being able to detect the attack, the cost associated with adversary-induced decoding error or failure, the false alarm cost, and the ease of a deceptive perturbation. We have designed the detection rule against the worst-case attacker who maximizes the cost metrics by knowing the designed defense, i.e., under the solution concept of Stackelberg equilibrium where the defender is the leader. We have provided a relaxation on the attacker’s strategy space in order to mitigate possible computational issues that might arise while computing the equilibrium when there is a large number of distinct road signs. This has enabled us to transform the problem into an LP with considerably low computational complexity. Finally, we have examined the performance numerically over various scenarios.
The proposed game theoretical framework brings in new research directions for the applications of smart road signs in intelligent transportation systems. In the following, we identify some of these future research directions:
- We emphasize that sensor fusion where we collect information through several separate sources can lead to more resilient and robust systems [37]. In the future, smart road signs combined with state-of-the-art vision-based road-sign recognition algorithms can provide both reliable and effective recognition by smart vehicles.
- A network of smart vehicles can lead to more reliable traffic networks. Particularly, a detection mechanism faces a trade-off between detecting an adversarial intervention and avoiding false alarms. Since a road sign would be encountered by multiple smart vehicles, those vehicles can share the false alarm cost against an attack on the road sign. Similar to herd immunity [38], a herd of smart vehicles can achieve more reliable road sign recognition.
- Additionally, this approach can also be a good fit for other classification problems that can be viewed as a signaling problem, where we can incorporate visual smart codes while transmitting information. For example, computer vision for (warehouse) inventory management [39] or intelligent robotic sorting [40] would constitute other interesting applications for the framework developed here.
References
- [1] A. Mogelmose, M. M. Trivedi, and T. B. Moeslund, “Vision-based traffic sign detection and analysis for intelligent driver assistance systems: Perspectives and survey,” IEEE Transactions on Intelligent Transportation Systems, vol. 13, no. 4, pp. 1484–1497, 2012.
- [2] K. Eykholt, I. Evtimov, E. Fernandes, B. Li, A. Rahmati, C. Xiao, A. Prakash, T. Kohno, and D. Song, “Robust physical-world attacks on deep learning visual classification,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
- [3] J. Jin, K. Fu, and C. Zhang, “Traffic sign recognition with hinge loss trained convolutional neural networks,” IEEE Transactions on Intelligent Transportation Systems, vol. 15, no. 5, pp. 1991–2000, 2014.
- [4] A. González, L. M. Bergasa, and J. J. Yebes, “Text detection and recognition on traffic panels from street-level imagery using visual appearance,” IEEE Transactions on Intelligent Transportation Systems, vol. 15, no. 1, pp. 228–238, 2014.
- [5] J. Greenhalgh and M. Mirmehdi, “Recognizing text-based traffic signs,” IEEE Transactions on Intelligent Transportation Systems, vol. 16, no. 3, pp. 1360–1369, 2015.
- [6] Y. Yang, H. Luo, H. Xu, and F. Wu, “Towards real-time traffic sign detection and classification,” IEEE Transactions on Intelligent Transportation Systems, vol. 17, no. 7, pp. 2022–2031, 2016.
- [7] Y. Zeng, X. Xu, D. Shen, Y. Fang, and Z. Xiao, “Traffic sign recognition using kernel extreme learning machines with deep perceptual features,” IEEE Transactions on Intelligent Transportation Systems, vol. 18, no. 6, pp. 1647–1653, 2017.
- [8] C. Liu, F. Chang, and Z. Chen, “Rapid multiclass traffic sign detection in high-resolution images,” IEEE Transactions on Intelligent Transportation Systems, vol. 15, no. 6, pp. 2394–2403, 2014.
