Automatic Target Recognition Using Discrimination Based on Optimal Transport

Ali Sadeghian; Deoksu Lim; Johan Karlsson; Jian Li

arXiv:1904.03534·cs.CV·April 9, 2019

TL;DR

This paper explores optimal transport distances, specifically the Monge-Kantorovich distance, for automatic target recognition in SAR images, and reports lower classification error than with the standard $\ell_{2}$ distance.

Contribution

It introduces a novel use of Monge-Kantorovich distance for classifying targets with spectral data, including a formulation for spectra with different total mass.

Findings

1. Monge-Kantorovich distance improves classification accuracy.
2. Efficient algorithms enable practical computation of the distance.
3. Spectral distances based on optimal transport are robust for target recognition.

Tables

Table 1: Time to compute $W_{\kappa,c}$ of two images using the two algorithms (in seconds).

| Solver | $29\times 24$, $\kappa=1$ | $29\times 24$, $\kappa=16$ | $29\times 24$, $\kappa=32$ | $58\times 48$, $\kappa=1$ | $58\times 48$, $\kappa=16$ | $58\times 48$, $\kappa=32$ |
|---|---|---|---|---|---|---|
| CVX | 47.53 | 47.79 | 47.62 | 698.6 | 802.5 | 817.8 |
| CPLEX | 0.018 | 0.040 | 0.062 | 0.244 | 0.631 | 0.927 |

Equations

$$\Pi(f_{0},f_{1}):=\Big\{M\in{\mathbb{R}}^{K\times K}\,:\,m(x_{0},x_{1})\geq 0,\ \sum_{x_{1}\in\Omega}m(x_{0},x_{1})=f_{0}(x_{0}),\ \sum_{x_{0}\in\Omega}m(x_{0},x_{1})=f_{1}(x_{1})\Big\}$$

$$T_{c}(f_{0},f_{1})=\min_{M\in\Pi(f_{0},f_{1})}\sum_{x_{0},x_{1}\in\Omega}m(x_{0},x_{1})\,c(x_{0},x_{1}).\tag{1}$$

$$W_{p,d}(f_{0},f_{1})=T_{c}(f_{0},f_{1})^{\min(1,\frac{1}{p})}$$

$$\tilde{T}_{c,\kappa}(f_{0},f_{1}):=\inf_{\|g_{0}\|_{1}=\|g_{1}\|_{1}}T_{c}(g_{0},g_{1})+\kappa\sum_{j=0}^{1}\|f_{j}-g_{j}\|_{1}.\tag{2}$$

$$\begin{aligned}\tilde{T}_{c,\kappa}(f_{0},f_{1})=\min_{M,\,g_{0},\,g_{1}}\quad&\operatorname{trace}(C^{T}M)+\kappa\sum_{j=0}^{1}\|f_{j}-g_{j}\|_{1}\\ \text{subject to}\quad&M\geq_{\rm e}0,\quad M{\mathbf{1}}_{K}=g_{0},\quad M^{T}{\mathbf{1}}_{K}=g_{1}\end{aligned}\tag{3}$$

$$\begin{aligned}\underset{\varphi}{\text{minimize}}\quad&\sum_{(u,v)\in E}\hat{c}(u,v)\,\varphi(u,v)\\ \text{subject to}\quad&\sum_{(v,u)\in E}\varphi(v,u)-\sum_{(u,v)\in E}\varphi(u,v)=d(v)\ \text{ for }v\in V,\\ &\varphi(u,v)\geq 0\ \text{ for }(u,v)\in E.\end{aligned}\tag{4}$$

$$d(w)=\|f_{1}\|_{1}-\|f_{0}\|_{1}.$$

$$\begin{aligned}\hat{\varphi}(u,v)&=\varphi(u,v)\ \text{ for all }u\in\mathcal{F}_{0}\setminus\{\hat{u}\},\ v\in\mathcal{F}_{1}\setminus\{\hat{v}\}\\ \hat{\varphi}(\hat{u},w)&=\varphi(\hat{u},w)+\varphi(\hat{u},\hat{v})\\ \hat{\varphi}(w,\hat{v})&=\varphi(w,\hat{v})+\varphi(\hat{u},\hat{v})\\ \hat{\varphi}(\hat{u},\hat{v})&=0\end{aligned}$$

$$\sum_{(u,v)\in E}\hat{c}(u,v)\,\varphi(u,v)-\sum_{(u,v)\in E}\hat{c}(u,v)\,\hat{\varphi}(u,v)=\varphi(\hat{u},\hat{v})\big(\hat{c}(\hat{u},\hat{v})-2\kappa\big)>0.$$

Abstract

The use of distances based on optimal transportation has recently shown promise for discrimination of power spectra. In particular, spectral estimation methods based on $\ell_{1}$ regularization as well as covariance-based methods can be shown to be robust with respect to such distances. These transportation distances provide a geometric framework where geodesics correspond to smooth transitions of spectral mass, and have been useful for tracking.

In this paper we investigate the use of these distances for automatic target recognition. We study the use of the Monge-Kantorovich distance compared to the standard $\ell_{2}$ distance for classifying civilian vehicles based on SAR images. We use a version of the Monge-Kantorovich distance that applies also for the case where the spectra may have different total mass, and we formulate the optimization problem as a minimum flow problem that can be computed using efficient algorithms.

**Index Terms:** Optimal transport, Automatic target recognition, SAR, Power spectra.

1 Introduction

In our information society there is an ever increasing stream of images, and automatic processing is key to analyzing and utilizing this information efficiently. It is therefore essential to quantify differences and similarities in images in a mathematically sound way. Estimation methods for radar and sonar imaging are often based on statistical quantities, and it is therefore natural to demand that a “small” change in the spectral content results in a small change in relevant statistical quantities. This is not the case for many standard metrics, where a small shift in the frequency of a spectral line results in a significant change in, e.g., the $\ell_{1}$ or the $\ell_{2}$ norm of the spectral difference.
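The sensitivity to a shifted spectral line can be illustrated with a small sketch. It is not taken from the paper: the helper names (`spike`, `l2_distance`, `transport_1d`) and the 20-bin grid are assumptions made for illustration. It uses the standard fact that, on a line with cost $|x-y|$ and equal total mass, the transport cost equals the $\ell_{1}$ distance between the cumulative sums.

```python
# Illustrative sketch (not from the paper): l2 distance versus a 1-D
# transport distance for a unit spectral line shifted by `shift` bins.
from itertools import accumulate
from math import sqrt


def spike(length, position):
    """Unit-mass spectrum with a single line at `position`."""
    return [1.0 if i == position else 0.0 for i in range(length)]


def l2_distance(f0, f1):
    """Euclidean norm of the point-by-point difference."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(f0, f1)))


def transport_1d(f0, f1):
    """Transport cost on a line with c(x, y) = |x - y| and equal total mass.

    In one dimension this is the l1 distance between the cumulative sums.
    """
    return sum(abs(a - b) for a, b in zip(accumulate(f0), accumulate(f1)))


for shift in (1, 2, 5):
    f0, f1 = spike(20, 5), spike(20, 5 + shift)
    print(shift, round(l2_distance(f0, f1), 3), transport_1d(f0, f1))
```

In this toy setting the $\ell_{2}$ distance is $\sqrt{2}$ for every nonzero shift, while the transport cost grows in proportion to the shift.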

In this paper we focus on the Monge-Kantorovich distance [1], also known as the earth mover's distance in the computer science community. This distance is rooted in optimal transport, has shown promise for both tracking and classification [2, 3, 4, 5, 6, 7], and is robust with respect to measurement error [8, 9]. In particular, for data-direct high resolution spectral estimation methods such as sparse methods based on $\ell_{1}$-regularization [10, 11], the magnitude of the true solution can be robustly recovered if the error is quantified using the Monge-Kantorovich distance and the support of the true signal is sparse with separated components [9]. For these problems, the so-called dictionary is by necessity highly coherent and no useful bounds can be obtained in terms of the $\ell_{p}$ norms [12]. The Monge-Kantorovich distance does not just compare images point by point, but instead penalizes the total transport of mass. For covariance-based methods as well, distances such as the Monge-Kantorovich distance have been shown to be robust with respect to measurement error, and robustness bounds are computable [13].

In this paper we consider automatic target recognition (ATR) of vehicles, where the goal is to analyze a SAR image of a parking lot and determine if a given car in the parking lot is a sedan, a sport utility vehicle (SUV), or a van. We compare the recognition rate using the Monge-Kantorovich distance to the recognition rate obtained using the classical $\ell_{2}$ distance. Section 2 gives a background where the transportation distance is defined. In Section 3 we reformulate the optimization problem of computing the transportation distance as a minimum cost flow problem. In Section 4 the automatic recognition problem is presented and we describe the classification procedure. Finally, the results are presented in Section 5, and Section 6 contains concluding remarks.

2 Background

The Monge-Kantorovich distance represents the minimal transportation cost of moving one “mass” distribution to another with specified cost of moving one unit amount of mass from one location to another [1].

Consider two $K$ -dimensional element-wise non-negative vectors $f_{0}$ and $f_{1}$ that each represent a distribution of “mass” at the locations $x\in\Omega$ . Let $m(x_{0},x_{1})$ denote the amount of mass transported from location $x_{0}$ to location $x_{1}$ , and we say that $M=(m(x_{0},x_{1}))_{x_{0},x_{1}\in\Omega}\in{\mathbb{R}}^{K\times K}$ is a feasible transportation plan from $f_{0}$ to $f_{1}$ if the respective marginals are equal to $f_{0}$ and $f_{1}$ , i.e., if $M$ is in the set

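$$\Pi(f_{0},f_{1}):=\Big\{M\in{\mathbb{R}}^{K\times K}\,:\,m(x_{0},x_{1})\geq 0,\ \sum_{x_{1}\in\Omega}m(x_{0},x_{1})=f_{0}(x_{0}),\ \sum_{x_{0}\in\Omega}m(x_{0},x_{1})=f_{1}(x_{1})\Big\}.$$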

Let $c(x_{0},x_{1})$ represent the cost of transferring one unit of mass from location $x_{0}\in\Omega$ to location $x_{1}\in\Omega$ , and define the matrix of transportation costs by $C:=[c(x_{0},x_{1})]_{x_{0},x_{1}\in\Omega}\in{\mathbb{R}}^{K\times K}$ . Then the minimum cost of transporting mass with distribution $f_{0}$ to a distribution $f_{1}$ is

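$$T_{c}(f_{0},f_{1})=\min_{M\in\Pi(f_{0},f_{1})}\sum_{x_{0},x_{1}\in\Omega}m(x_{0},x_{1})\,c(x_{0},x_{1}).\tag{1}$$

This is known as the Monge-Kantorovich distance [14]. Monge-Kantorovich distances are not metrics in general, but they readily give rise to the class of so-called Wasserstein metrics:

$$W_{p,d}(f_{0},f_{1})=T_{c}(f_{0},f_{1})^{\min(1,\frac{1}{p})},$$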

where the cost function is of the form $c({x_{0}},{x_{1}})=d({x_{0}},{x_{1}})^{p}$ , and where $d$ is a metric on $\Omega$ and $p\in(0,\infty)$ [1].

The Monge-Kantorovich theory deals with mass distributions of equal mass. However, the distance can be generalized to distributions of possibly unequal mass as follows [8]. Given the two mass distributions $f_{0}$ and $f_{1}$, we postulate that these are perturbations of two other mass distributions $g_{0},g_{1}\in{\mathbb{R}}^{K}$ that have equal mass. Then, the cost of transporting $f_{0}$ and $f_{1}$ to one another can be thought of as the cost of transporting $g_{0}$ and $g_{1}$ to one another plus the size of the respective perturbations:

$$\tilde{T}_{c,\kappa}(f_{0},f_{1}):=\inf_{\|g_{0}\|_{1}=\|g_{1}\|_{1}}T_{c}(g_{0},g_{1})+\kappa\sum_{j=0}^{1}\|f_{j}-g_{j}\|_{1}.\tag{2}$$

These distances have several interesting properties. They are weak∗ continuous and hence may be used to localize spectral mass [1, 13]. They are contractive with respect to additive and normalized multiplicative noise, reflecting the fact that noise impedes the ability to discriminate [8]. Furthermore, they have additional properties relating to deformations of spectra and smoothness with respect to translation. More specifically, geodesics (e.g., of the Wasserstein-2 metric) preserve “lumpiness.” A consequence of this is that when linking power spectra via geodesics of the metric, the corresponding peaks often seem to be “matched” and the power between them transfers in a consistent manner. Such a property appears highly desirable in morphing for, e.g., tracking of frequencies in a slowly time-varying signal and integrating data from a variety of sources (see, e.g., [15, 16, 5]). See also [17] for a matrix-valued extension.

3 Computation of the Monge-Kantorovich distance

The computation of the Monge-Kantorovich distance is a linear optimization problem and can in principle be computed using any standard convex optimization software. We can write the Monge-Kantorovich distance (2) as:

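$$\begin{aligned}\tilde{T}_{c,\kappa}(f_{0},f_{1})=\min_{M,\,g_{0},\,g_{1}}\quad&\operatorname{trace}(C^{T}M)+\kappa\sum_{j=0}^{1}\|f_{j}-g_{j}\|_{1}\\ \text{subject to}\quad&M\geq_{\rm e}0,\quad M{\mathbf{1}}_{K}=g_{0},\quad M^{T}{\mathbf{1}}_{K}=g_{1},\end{aligned}\tag{3}$$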

where $M\in{\mathbb{R}}^{K\times K}$ is a matrix that represents the transportation plan from $g_{0}$ to $g_{1}$ , and $C=[c(x_{i},x_{j})]_{x_{i},x_{j}\in\Omega}\in{\mathbb{R}}^{K\times K}$ is the cost matrix that contains the costs of moving a unit of mass from one point to another. Here $\geq_{\rm e}$ denotes element-wise inequality and ${\mathbf{1}}_{K}$ is the $K\times 1$ vector of ones.
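To make the ingredients of problem (3) concrete, the following minimal sketch evaluates its objective for a given candidate plan. It is an illustration under stated assumptions, not the authors' code: the function name `objective_3` is hypothetical, the transport term is read as $\sum_{i,j}c(x_{i},x_{j})\,m(x_{i},x_{j})$, and the marginals are taken as $g_{0}=M{\mathbf{1}}_{K}$ and $g_{1}=M^{T}{\mathbf{1}}_{K}$ (the second by symmetry with the first). It only evaluates a candidate and does not solve the optimization.

```python
# Minimal sketch (hypothetical helper, not from the paper): evaluate the
# objective of problem (3) for a candidate transport plan M.

def objective_3(M, f0, f1, C, kappa):
    """Transport cost of M plus kappa times the l1 size of the perturbations."""
    K = len(f0)
    if any(M[i][j] < 0 for i in range(K) for j in range(K)):
        raise ValueError("M must be element-wise non-negative")
    g0 = [sum(M[i]) for i in range(K)]                        # g0 = M 1_K (row sums)
    g1 = [sum(M[i][j] for i in range(K)) for j in range(K)]   # g1 = M^T 1_K (column sums)
    transport = sum(C[i][j] * M[i][j] for i in range(K) for j in range(K))
    penalty = (sum(abs(a - b) for a, b in zip(f0, g0))
               + sum(abs(a - b) for a, b in zip(f1, g1)))
    return transport + kappa * penalty


# Toy data: two locations at unit distance, unequal total mass.
C = [[0, 1], [1, 0]]
f0, f1 = [3, 0], [0, 1]
move_one_unit = [[0, 1], [0, 0]]   # move one unit from location 0 to location 1
move_nothing = [[0, 0], [0, 0]]    # pay only the perturbation penalty
print(objective_3(move_one_unit, f0, f1, C, kappa=1))
print(objective_3(move_nothing, f0, f1, C, kappa=1))
```

With these toy values, moving one unit costs less than moving nothing, because the unit transport cost is below $2\kappa$.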

One challenge here is the computational burden of computing the distances for large $K$ . However, it is well known that the optimal transport problem can be posed as a minimal cost flow problem (see, e.g., [18]). We will here show that this approach may be modified to include the optimization problem (2), hence allowing for the use of efficient specialized network algorithms for fast computations [19].

3.1 Monge-Kantorovich Distance as a Network Simplex Problem

In this section we will describe how the Monge-Kantorovich distance (2) can be formulated as a minimum cost flow problem. Finding the minimum-cost flow consists of determining the cheapest way to transport a given supply to a given demand through a graph, and such problems can be solved efficiently.

More specifically, a minimum-cost flow problem is formulated as follows. Let $G=(V,E)$ be a directed graph with a cost $\hat{c}(u,v)$ associated with each edge $(u,v)\in E$ . Then associate each node $v\in V$ with a number $d(v)\in{\mathbb{R}}$ corresponding to the supply of that node if $d(v)>0$ and the demand of that node if $d(v)<0$ . The problem is then to find the flow, ${\varphi:E\rightarrow\mathbb{R}_{\geq 0}}$ , that matches the supply to the demand with minimal total cost:

$$\begin{aligned}\underset{\varphi}{\text{minimize}}\quad&\sum_{(u,v)\in E}\hat{c}(u,v)\,\varphi(u,v)\\ \text{subject to}\quad&\sum_{(v,u)\in E}\varphi(v,u)-\sum_{(u,v)\in E}\varphi(u,v)=d(v)\ \text{ for }v\in V,\\ &\varphi(u,v)\geq 0\ \text{ for }(u,v)\in E.\end{aligned}\tag{4}$$

Next, we will formulate (3) as a minimum cost flow problem. Let each of the two sets $\mathcal{F}_{0}=\{u_{i}:\,i=1,\ldots,K\}$ and $\mathcal{F}_{1}=\{v_{i}:\,i=1,\ldots,K\}$ correspond to the set of sample points $x_{i}\in\Omega$ , and let $d(u_{i})=f_{0}(x_{i})$ and $d(v_{i})=-f_{1}(x_{i})$ , for $1\leq i\leq K$ , be the corresponding supply or demand. Let $G_{0}=(V_{0},E_{0})$ be the complete bipartite di-graph with bipartition $\mathcal{F}_{0}$ and $\mathcal{F}_{1}$ , with edges directed from $\mathcal{F}_{0}$ to $\mathcal{F}_{1}$ . The cost of the edge connecting $u_{i}\in\mathcal{F}_{0}$ to $v_{j}\in\mathcal{F}_{1}$ is assigned as $\hat{c}(u_{i},v_{j})=c(x_{i},x_{j})$ in (3), i.e., the distance between $x_{i}$ and $x_{j}$ . The minimum cost flow problem (4) corresponding to $G_{0}$ with costs $\hat{c}$ and demand/supply rates $d$ corresponds to the standard transportation problem (1).

In order to allow for mass perturbations (2) we add an extra node. To this end, let $G=(V,E)$ where $V=V_{0}\cup\{w\}$ , and let $w$ be connected to every node in $V_{0}$ in both directions, i.e., $E=E_{0}\cup\{(w,v),(v,w):\,v\in V_{0}\}$ . Further, let the cost of the edges be $\hat{c}(w,v)=\hat{c}(v,w)=\kappa$ for $v\in V_{0}$ , and let the demand of $w$ be

$$d(w)=\|f_{1}\|_{1}-\|f_{0}\|_{1}.$$

By introducing this demand, the total demand and supply sum to zero even when $f_{0}$ and $f_{1}$ have different total mass.

One can easily see that the minimum cost flow of $G$ equals the transportation cost $\tilde{T}_{c,\kappa}(f_{0},f_{1})$ . In this setting, the functions $g_{0}$ and $g_{1}$ in (3) correspond to the supply and demand resulting from the flow in $G_{0}$ , and $\|g_{i}-f_{i}\|_{1}$ corresponds to the flow between $w$ and $\mathcal{F}_{i}$ .

Min-cost flow problems on graphs have been well studied, starting with the early work of D. R. Fulkerson in 1961 [20]. A polynomial time network simplex algorithm for minimum cost flow problems is given in [19]. Table 1 shows the time advantage of using this method compared to directly solving (3) using a general purpose convex optimization tool like CVX.
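The graph construction of this section can be sketched in a few lines of standard-library Python. This is a toy illustration under stated assumptions, not the authors' implementation: a simple successive-shortest-path solver stands in for the network simplex algorithm and the CPLEX solver used in the paper, masses are assumed to be integers, the function names are hypothetical, and source nodes ($\mathcal{F}_{0}$) are taken to carry supply $f_{0}$ and sink nodes ($\mathcal{F}_{1}$) demand $f_{1}$.

```python
# Toy sketch: unbalanced transport cost via min-cost flow on the graph G.
# A successive-shortest-path solver stands in for network simplex / CPLEX.
# Hypothetical helper names; integer masses assumed.

def min_cost_flow(num_nodes, edges, supply):
    """Minimum cost of a flow meeting `supply` (d(v) > 0 means supply)."""
    source, sink = num_nodes, num_nodes + 1
    graph = [[] for _ in range(num_nodes + 2)]   # node -> ids of outgoing arcs
    head, cap, cost = [], [], []                 # arc 2k+1 is the reverse of arc 2k

    def add_arc(a, b, capacity, weight):
        graph[a].append(len(head)); head.append(b); cap.append(capacity); cost.append(weight)
        graph[b].append(len(head)); head.append(a); cap.append(0); cost.append(-weight)

    total = sum(s for s in supply if s > 0)
    for a, b, weight in edges:
        add_arc(a, b, total, weight)             # `total` acts as an unbounded capacity
    for node, s in enumerate(supply):
        if s > 0:
            add_arc(source, node, s, 0)
        elif s < 0:
            add_arc(node, sink, -s, 0)

    flow_cost, remaining = 0, total
    while remaining > 0:
        # Bellman-Ford shortest path from source in the residual graph.
        dist = [float("inf")] * (num_nodes + 2)
        via = [-1] * (num_nodes + 2)
        dist[source] = 0
        for _ in range(num_nodes + 1):
            changed = False
            for a in range(num_nodes + 2):
                if dist[a] == float("inf"):
                    continue
                for e in graph[a]:
                    if cap[e] > 0 and dist[a] + cost[e] < dist[head[e]] - 1e-12:
                        dist[head[e]] = dist[a] + cost[e]
                        via[head[e]] = e
                        changed = True
            if not changed:
                break
        # Find the bottleneck along the path, then push that much flow.
        push, node = remaining, sink
        while node != source:
            e = via[node]
            push = min(push, cap[e])
            node = head[e ^ 1]
        node = sink
        while node != source:
            e = via[node]
            cap[e] -= push
            cap[e ^ 1] += push
            node = head[e ^ 1]
        flow_cost += push * dist[sink]
        remaining -= push
    return flow_cost


def unbalanced_transport_cost(f0, f1, cost_fn, kappa):
    """Build G = G0 plus the extra node w and return the min-cost flow value."""
    K = len(f0)
    w = 2 * K                                    # node ids: u_i -> i, v_j -> K + j
    edges = [(i, K + j, cost_fn(i, j)) for i in range(K) for j in range(K)]
    for v in range(2 * K):
        edges += [(w, v, kappa), (v, w, kappa)]  # w is linked both ways at cost kappa
    supply = list(f0) + [-m for m in f1] + [sum(f1) - sum(f0)]  # last entry is d(w)
    return min_cost_flow(2 * K + 1, edges, supply)


print(unbalanced_transport_cost([3, 0], [0, 1], lambda i, j: abs(i - j), kappa=1))
```

For real image sizes a dedicated network solver is the appropriate tool; the sketch is only meant to show how the supplies, the extra node $w$, and the edge costs fit together.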

3.2 Role of $\kappa$

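The Monge-Kantorovich distance contains a free parameter $\kappa$ that specifies the penalty of adding and removing a unit of mass to the spectra. In the reformulation of $\tilde{T}_{c,\kappa}$ as the min-cost flow problem, the flow of the optimal solution in any edge with cost greater than $2\kappa$ is going to be 0. To see this, assume that $(\hat{u},\hat{v})\in E$ is an edge with $\hat{c}(\hat{u},\hat{v})>2\kappa$ and $\varphi(\hat{u},\hat{v})>0$ . Then the flow $\hat{\varphi}$ given by

$$\begin{aligned}\hat{\varphi}(u,v)&=\varphi(u,v)\ \text{ for all }u\in\mathcal{F}_{0}\setminus\{\hat{u}\},\ v\in\mathcal{F}_{1}\setminus\{\hat{v}\}\\ \hat{\varphi}(\hat{u},w)&=\varphi(\hat{u},w)+\varphi(\hat{u},\hat{v})\\ \hat{\varphi}(w,\hat{v})&=\varphi(w,\hat{v})+\varphi(\hat{u},\hat{v})\\ \hat{\varphi}(\hat{u},\hat{v})&=0\end{aligned}$$

is feasible and has lower cost:

$$\sum_{(u,v)\in E}\hat{c}(u,v)\,\varphi(u,v)-\sum_{(u,v)\in E}\hat{c}(u,v)\,\hat{\varphi}(u,v)=\varphi(\hat{u},\hat{v})\big(\hat{c}(\hat{u},\hat{v})-2\kappa\big)>0.$$

This contradicts the assumption that $\varphi$ is a minimum cost flow; hence the support of $\varphi$ may be restricted to the edges of cost less than or equal to $2\kappa$ .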
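The $2\kappa$ threshold can be checked by hand on a two-point example, constructed here for illustration and not taken from the paper. Let $\Omega=\{x,y\}$ with $c(x,x)=c(y,y)=0$ and $c(x,y)=c(y,x)=\delta>0$ , let $f_{0}$ place mass $a$ at $x$ and $f_{1}$ place mass $b$ at $y$ , with $a\geq b>0$ . One can check that it suffices to consider plans that move an amount $t\in[0,b]$ from $x$ to $y$ and account for the remaining mass through the perturbation terms, which gives the cost

$$h(t)=\delta\,t+\kappa\big((a-t)+(b-t)\big)=\kappa(a+b)+t\,(\delta-2\kappa),\qquad 0\leq t\leq b.$$

Since $h$ is linear in $t$ , the minimum is attained at an endpoint:

$$\tilde{T}_{c,\kappa}(f_{0},f_{1})=\begin{cases}b\,\delta+\kappa(a-b),&\delta<2\kappa,\\ \kappa(a+b),&\delta\geq 2\kappa.\end{cases}$$

So when $\delta>2\kappa$ no mass is transported along the edge, in agreement with the argument above: removing the mass at $x$ and adding it at $y$ is cheaper than moving it.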

This observation significantly reduces the number of edges in the graph and hence the computation required for calculating the Monge-Kantorovich distance. The computational time bound for the network simplex algorithm is $O(K^{2}N^{2}\log(K))$ , where $K$ is the number of nodes in the graph and $N$ is the number of edges [19]. Therefore, if restricting the graph to edges of cost at most $2\kappa$ reduces the number of edges by a factor $p$ , this bound on the computation time is reduced by a factor of $p^{2}$ .
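As a rough illustration of how many edges survive the $2\kappa$ restriction, the sketch below counts the pixel pairs of an image grid whose Euclidean distance is at most $2\kappa$ . The function name `count_edges` is hypothetical, and the count covers only the bipartite edges between $\mathcal{F}_{0}$ and $\mathcal{F}_{1}$ (the edges to the extra node $w$ are not included).

```python
# Illustrative sketch (hypothetical helper): count bipartite edges whose
# Euclidean cost is at most 2 * kappa on a rows-by-cols pixel grid.
from math import hypot


def count_edges(rows, cols, kappa):
    """Return (kept, total) numbers of ordered pixel pairs."""
    pixels = [(r, c) for r in range(rows) for c in range(cols)]
    total = len(pixels) ** 2
    kept = sum(
        1
        for (r0, c0) in pixels
        for (r1, c1) in pixels
        if hypot(r0 - r1, c0 - c1) <= 2 * kappa
    )
    return kept, total


for kappa in (1, 16, 32):
    kept, total = count_edges(29, 24, kappa)
    print(kappa, kept, total, round(total / kept, 1))  # last column: reduction factor p
```

The last printed column is the factor $p$ by which the edge count shrinks; smaller values of $\kappa$ keep fewer edges.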

4 Automatic target recognition

We consider the problem of automatic target recognition of civilian vehicles, where the goal is to analyze a SAR image of a parking lot and determine if a given car in the parking lot is a sedan, an SUV, or a van. This recognition problem is solved by first identifying the vehicles and then using a classification method to determine which class each car belongs to. We compute the results using both the Monge-Kantorovich distance and the $\ell_{2}$ distance as the distance in the classification method, in order to compare recognition rates.

4.1 The data set

We use the Gotcha 2008 [21] data set, in which SAR images are taken by an airborne radar flying a circular flight pattern (this is the GOTCHA volumetric SAR data from the U.S. Air Force Sensor Data Management System, which is publicly available by request). SAR imaging comes down to a 2D spectral estimation problem [22] and gives an image of reflections for a given look angle. These estimation problems are solved using sparse imaging methods [11], and the resulting images are then fused together using standard SAR imaging techniques. This results in a data set containing images of 535 cars parked in a parking lot: 231 sedans, 182 SUVs, and 122 vans. For the sake of an equal group size, 120 images are picked from each car type.

As a preprocessing step, the cars in the image are rotated so that they are aligned and cropped such that each image contains only the car image. The pose estimation method is described in [23]. To speed up the computation of the Monge-Kantorovich distance, images are scaled down to $58\times 48$ and $29\times 24$ pixels. The rescaling uses a bicubic interpolation where each pixel is replaced by a weighted average of pixels in the nearest $4\times 4$ neighborhood. This also allows us to study the robustness against image resolution.

4.2 Methodology

Next, the Monge-Kantorovich and the $\ell_{2}$ distances are computed for all pairs of images. For the Monge-Kantorovich distance, the Euclidean distance $c(x,y)=\|x-y\|_{2}$ , where $x,y\in\Omega$ , is used as the underlying distance. To solve the min-cost flow for the graph constructed in Section 3.1 we used TOMLAB CPLEX [24]. The solver is generally considered the state-of-the-art large scale mixed integer linear and quadratic programming solver.

For the classification, a training set is selected at random, consisting of one third of the images from each group. The rest of the images are used as test data. The test images are then classified using the nearest neighbour method, where each test image is assigned the class of the closest training image. The error rate is then computed as the number of mislabeled cars divided by the total number of test cars. This process is repeated $1000$ times and the average error rates are depicted in Fig. 3.
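The evaluation procedure described above can be sketched as follows, assuming the pairwise distances have already been computed. The function name `nn_error_rate`, its parameters, and the toy data are assumptions made for illustration; the per-class training size is rounded to the nearest integer.

```python
# Minimal sketch (hypothetical helper): average nearest-neighbour error rate
# over repeated random splits, given a precomputed pairwise distance matrix.
import random


def nn_error_rate(dist, labels, train_fraction=1 / 3, repeats=1000, seed=0):
    """dist[i][j] is the distance between images i and j; labels[i] is the class."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)

    rates = []
    for _ in range(repeats):
        # Draw the same fraction of training images from every class.
        train = []
        for members in by_class.values():
            train += rng.sample(members, round(len(members) * train_fraction))
        train_set = set(train)
        test = [i for i in range(len(labels)) if i not in train_set]
        # Each test image gets the class of its closest training image.
        errors = sum(
            labels[min(train, key=lambda t: dist[i][t])] != labels[i] for i in test
        )
        rates.append(errors / len(test))
    return sum(rates) / len(rates)


# Toy data: two well separated groups of points on a line.
points = [0, 1, 2, 10, 11, 12]
labels = ["sedan"] * 3 + ["van"] * 3
dist = [[abs(p - q) for q in points] for p in points]
print(nn_error_rate(dist, labels, repeats=20))
```

Passing the Monge-Kantorovich distance matrix or the $\ell_{2}$ distance matrix as `dist` would give the two error rates being compared.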

5 Results

From the error rates in Fig. 3 it is seen that the recognition rates are considerably higher when using the Monge-Kantorovich distance than when using the $\ell_{2}$ distance, provided that $\kappa$ is chosen appropriately. Also, as long as $\kappa$ is in a reasonable range, the recognition rate is not very sensitive to its value, and hence the method can be considered semi-parametric.

From Fig. 3 it can be seen that the best recognition rate for the Monge-Kantorovich based recognition is the same for the two image granularity levels (for correctly selected $\kappa$ ). This suggests that this distance is relatively insensitive to rescaling/smoothing of the image. Also note that when the image size is reduced, the error rate for the $\ell_{2}$ norm drops by a considerable amount. The down-sampling method takes an average of the neighbouring pixels and uses it as the new pixel value, and hence acts as smoothing. So when the $\ell_{2}$ norm is computed for the smaller image size, it is less sensitive to the pixel-by-pixel error and more sensitive to the total spectral energy in a region.
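The smoothing effect of down-sampling on the $\ell_{2}$ distance can be seen in a toy example. This sketch is not from the paper: plain $2\times 2$ block averaging is used as a simple stand-in for the bicubic interpolation described in Section 4.1, and the helper names are hypothetical.

```python
# Toy sketch: block averaging (a stand-in for bicubic rescaling) makes the
# l2 distance insensitive to a one-pixel shift within the same block.
from math import sqrt


def block_average(image, factor=2):
    """Down-sample by averaging non-overlapping factor-by-factor blocks."""
    rows, cols = len(image) // factor, len(image[0]) // factor
    return [
        [
            sum(
                image[r * factor + dr][c * factor + dc]
                for dr in range(factor)
                for dc in range(factor)
            ) / factor ** 2
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def l2(a, b):
    """Euclidean norm of the pixel-by-pixel difference of two images."""
    return sqrt(sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)))


def spike_image(size, row, col):
    """Image with a single unit reflection at (row, col)."""
    return [[1.0 if (i, j) == (row, col) else 0.0 for j in range(size)] for i in range(size)]


a, b = spike_image(4, 0, 0), spike_image(4, 0, 1)
print(l2(a, b), l2(block_average(a), block_average(b)))
```

At full resolution the two images differ in two pixels, while after averaging they coincide, since both reflections fall in the same block.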

6 Conclusions

In this paper we consider the optimal transport distance and its application for automatic target recognition. The results show that the error rate can be considerably lower when using the Monge-Kantorovich distance compared to the standard $\ell_{2}$ distance as underlying distance. We also present a fast way to compute the Monge-Kantorovich distance using the network simplex algorithm that applies also for spectra with different total mass.

Bibliography

[1] C. Villani, Topics in Optimal Transportation, vol. 58, Graduate Studies in Mathematics, AMS, 2003.
[2] J. De Gol and M. Nam, “A clustering approach for detecting moving objects captured by a moving aerial camera,” in Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on, May 2014, pp. 6538–6542.
[3] S. Haker, L. Zhu, A. Tannenbaum, and S. Angenent, “Optimal mass transport for registration and warping,” International Journal of Computer Vision, vol. 60, no. 3, pp. 225–240, 2004.
[4] J.R. Hoffman and R.P.S. Mahler, “Multitarget miss distance via optimal assignment,” Systems, Man and Cybernetics, Part A: Systems and Humans, IEEE Transactions on, vol. 34, no. 3, pp. 327–336, May 2004.
[5] X. Jiang, L. Ning, and T.T. Georgiou, “Distances and Riemannian metrics for multivariate spectral densities,” IEEE Transactions on Automatic Control, vol. 57, no. 7, pp. 1723–1735, 2012.
[6] M. Muskulus and S. Verduyn-Lunel, “Wasserstein distances in the analysis of time series and dynamical systems,” Physica D: Nonlinear Phenomena, vol. 240, no. 1, pp. 45–58, 2011.
[7] L. Schmidt, C. Hegde, P. Indyk, J. Kane, L. Lu, and D. Hohl, “Automatic fault localization using the generalized earth mover’s distance,” in Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on, May 2014, pp. 8134–8138.
[8] T. Georgiou, J. Karlsson, and M.S. Takyar, “Metrics for power spectra: An axiomatic approach,” IEEE Transactions on Signal Processing, vol. 57, no. 3, pp. 859–867, March 2009.
