Multiplex graph matching matched filters

Konstantinos Pantazis; Daniel L. Sussman; Youngser Park; Zhirui Li; Carey E. Priebe; Vince Lyzinski

arXiv:1908.02572·cs.SI·December 6, 2021

TL;DR

This paper introduces a multiplex graph matching approach that extends classical methods to handle multi-channel networks, improving template detection in complex background networks through theoretical and empirical validation.

Contribution

It develops a multiplex analogue of graph matching, enabling efficient template detection in multi-channel networks, which is a novel extension of previous single-channel methods.

Findings

01

The method effectively detects templates in multiplex networks.

02

Considering multiple channels enhances detection accuracy.

03

The approach is validated both theoretically and empirically.

Abstract

We consider the problem of detecting a noisy induced multiplex template network in a larger multiplex background network. Our approach, which extends the framework of Sussman et al. (2019) to the multiplex setting, leverages a multiplex analogue of the classical graph matching problem to use the template as a matched filter for efficiently searching the background for candidate template matches. The effectiveness of our approach is demonstrated both theoretically and empirically, with particular attention paid to the potential benefits of considering multiple channels.

Tables

Table 1: For each padding regime, we report the percentage of template edges present in the recovered background signal in the best random restart. For example, the best recovered background signal in the Centered Padding regime recovered 86.67% of the edges in template channel 1, 85.07% of the edges in template channel 2, and 96.77% of the edges in template channel 3. Here, the best performer is the one that recovers the highest average percentage across the three channels (averaging the within-channel percentages across channels).

Padding regime	% recovered in ch. 1	% recovered in ch. 2	% recovered in ch. 3
Centered	86.67	85.07	96.77
Naive	98.33	100	96.77

Table 2: Performance on Template 1

Method	T1B	T1C	T1D	T1E	T1F
M-GMMF	0.24	6.28	4.35	5.17	3.12
G-Finder	7.38	18.81	0.76	2.25	14.96
Tu et al. [43]	0.347	8.783	15.47	19.194	13.596
Kopylov et al. [26]	1.17	3.47	2.58	3.85	2.94

Equations

$$\operatorname*{arg\,min}_{P\in\Pi_{n}}\|AP-PB\|_{F}=\operatorname*{arg\,max}_{P\in\Pi_{n}}\mathrm{trace}(APB^{T}P^{T}).$$

$$A_{i}(u,v)=\begin{cases}1 & \text{if } u,v\in V(H_{i}) \text{ and } \{u,v\}\in E(H_{i});\\ 0 & \text{if } u,v\in V(H_{i}) \text{ and } \{u,v\}\notin E(H_{i});\\ 0 & \text{if } u \text{ or } v\in[m]\setminus V(H_{i});\end{cases}$$

$$B_{i}(u,v)=\begin{cases}1 & \text{if } u,v\in V(G_{i}) \text{ and } \{u,v\}\in E(G_{i});\\ 0 & \text{if } u,v\in V(G_{i}) \text{ and } \{u,v\}\notin E(G_{i});\\ 0 & \text{if } u \text{ or } v\in[n]\setminus V(G_{i});\end{cases}$$

$$A_{i}(u,v)$$

$$B_{i}(u,v)$$

$$\operatorname*{arg\,min}_{P\in\Pi_{n}}\sum_{i=1}^{c}\|(A_{i}\oplus 0_{n-m})P-PB_{i}\|_{F}^{2}=\operatorname*{arg\,min}_{P\in\Pi_{n}}\sum_{i=1}^{c}-\mathrm{tr}\big((A_{i}\oplus 0_{n-m})PB_{i}P^{T}\big),$$

$$\operatorname*{arg\,min}_{P\in\Pi_{n}}\sum_{i=1}^{c}\|(A_{i}\oplus 0_{n-m})P-PB_{i}\|_{F}^{2}=\operatorname*{arg\,min}_{P\in\Pi_{n}}\sum_{i=1}^{c}-\mathrm{tr}\big((A_{i}\oplus 0_{n-m})PB_{i}P^{T}\big).$$

$$\operatorname*{arg\,min}_{P\in\Pi_{n}}\sum_{i=1}^{c}\|A_{i}P-PB_{i}\|_{F}^{2}.$$

$$\operatorname*{arg\,min}_{P\in\Pi_{n}}\sum_{i=1}^{c}\lambda_{i}\|A_{i}P-PB_{i}\|_{F}^{2},$$

$$A_{G,E}(i,j)=A(i,j)\cdot(1-2X(i,j)),$$

$$\Delta_{P}=\Big\{\{i,j\}\in\binom{[m]}{2}\text{ s.t. }\{i,j\}\neq\{\sigma_{p}(i),\sigma_{p}(j)\}\Big\},$$

$$\Delta_{P}^{(i,1)}$$

$$\Delta_{P}^{(i,2)}$$

$$\sum_{i\in[c]\setminus\mathcal{B}}\big(2|\Delta_{P}^{(i,1)}|+|\Delta_{P}^{(i,2)}|\big)(1-2s_{i})(1-2q_{i})\geq k\,\frac{672\,m^{1+\alpha}c}{\beta}.$$

$$\mathbb{P}\Big(\operatorname*{arg\,min}_{P\in\Pi_{n}}\sum_{i=1}^{c}\|(A_{i}\oplus 0_{n-m})P-PB_{i}\|_{F}^{2}\not\subset\mathcal{P}_{m,n}\Big)\leq 2n^{-2},$$

$$|\Delta_{P}^{(i,1)}|\in\big(|\Delta_{P}|\,p_{i}(1-p_{i}),\;3|\Delta_{P}|\,p_{i}(1-p_{i})\big),$$

$$1-2\exp\Big\{\frac{-2|\Delta_{P}|\,p_{i}^{2}(1-p_{i})^{2}}{8}\Big\}.$$

$$|\Delta_{P}^{(i,1)}|\in(1/3,3)\cdot mk\,p_{i}(1-p_{i}).$$

$$\sum_{i\in[c]\setminus\mathcal{B}}p_{i}(1-2s_{i})(1-2q_{i})\geq\frac{6048\,m^{\alpha-1}c}{\beta}.$$

$$\mathbb{P}$$

$$m\,p_{i}^{2}\geq\xi\log n\text{ for all } i\in[c],$$

$$(1/2-s)(1/2-q)(c_{1}-c_{2})>\gamma m^{\alpha-1}c,$$

$$(1/2-s)(1/2-q)\,c>\gamma m^{\alpha-1},$$

$$E_{i}^{(1)}(j,\ell)$$

$$E_{i}^{(2)}(j,\ell)$$

$$\Delta_{P}^{(1)}$$

$$\Delta_{P}^{(2)}$$

$$|\Delta_{P}^{(1)}|\sum_{i}2(1-2s_{i})(1-r_{i}-t_{i})+|\Delta_{P}^{(2)}|\sum_{i}2(1-2q_{i})(1-r_{i}-t_{i})\geq k\,\frac{672\,m^{1+\alpha}c}{\beta},$$

$$\mathbb{P}\Big(\operatorname*{arg\,min}_{P\in\Pi_{n}}\sum_{i=1}^{c}\|(A_{i}\oplus 0_{n-m})P-PB_{i}\|_{F}^{2}\not\subset\mathcal{P}_{m,n}\Big)=2n^{-2};$$

$$|\Delta_{P}^{(j)}|\in\Big(\tfrac{1}{2}|\Delta_{P}|\,p(1-p),\;\tfrac{3}{2}|\Delta_{P}|\,p(1-p)\Big),$$

$$1-2\exp\Big\{\frac{-2|\Delta_{P}|\,p^{2}(1-p)^{2}}{32}\Big\}.$$

$$|\Delta_{P}^{(j)}|\in(1/6,3/2)\cdot mk\,p(1-p).$$

$$p\sum_{i=1}^{c}(1-s_{i}-q_{i})(1-r_{i}-t_{i})\geq\frac{6048\,m^{\alpha-1}c}{\beta}.$$

Code & Models

Repositories

jataware/mgmmf

Taxonomy

Topics: Advanced Graph Neural Networks · Bayesian Modeling and Causal Inference · Complex Network Analysis Techniques

Full text

Multiplex graph matching matched filters

Konstantinos Pantazis

Department of Mathematics, University of Maryland, College Park

Daniel L. Sussman

Department of Mathematics and Statistics, Boston University

Youngser Park

Center for Imaging Sciences, Johns Hopkins University

Zhirui Li

Department of Mathematics, University of Maryland, College Park

Carey E. Priebe

Department of Applied Mathematics and Statistics, Johns Hopkins University

Vince Lyzinski

Department of Mathematics, University of Maryland, College Park

Abstract

We consider the problem of detecting a noisy induced multiplex template network in a larger multiplex background network. Our approach, which extends the framework of [41] to the multiplex setting, leverages a multiplex analogue of the classical graph matching problem to use the template as a matched filter for efficiently searching the background for candidate template matches. The effectiveness of our approach is demonstrated both theoretically and empirically, with particular attention paid to the potential benefits of considering multiple channels.

Keywords— Multiplex graphs, graph matching, correlation network models, matched filters

1 Introduction and Background

Multilayer and multiplex networks have proven to be useful models for capturing complex relational data where multiple types of relations are potentially present between vertices in the network [7, 24]. For example, in connectomes (i.e., brain graphs) different edge modalities can represent different synapse types between neurons [45]; in social networks different edge modalities can capture relationships in different social network platforms [21]; in scholarly networks different edge modalities can capture co-authorship across multiple classification categories [36]. Moreover, in many applications leveraging the signal across the different layers of the network can lead to better, more robust performance than working within any single network modality [35, 24, 8].

The inference task we consider here is the problem of detecting (possibly multiple copies of) a noisy induced subgraph in a multiplex background network (see Definition 1.1 for the definition of multiplex networks we consider herein). Succinctly, given a multiplex template ${\bf A}$ with $m$ vertices, we seek to find the “best fitting” subgraph(s) in a larger multiplex background network ${\bf B}$ (see Section 2 for detail) with $n\gg m$ vertices. This problem is a generalization of the NP-complete [40] multiplex subgraph isomorphism problem (see [25] for a definition of multiplex isomorphism), accounting for the reality that relatively large, complex subgraph templates may only errorfully occur in the larger background network. These errors may be due to missing edges/vertices in the template or background, and arise in a variety of real data settings [39]. The subgraph isomorphism problem—given a template $A$ , determine if an isomorphic copy of $A$ exists in a larger network $B$ and find the isomorphic copy (or copies) if it exists—has been the subject of voluminous research in the monoplex (i.e., single layer) setting, with approaches based on efficient tree search [44], color coding [3, 2], graph homomorphisms [20], rule-based/filter-based matchings [10, 33], among others; for a survey of the literature circa 2012, see [27]. In contrast, the problem of multilayer homomorphic/isomorphic subgraph detection is still in its relative infancy, with comparatively fewer existing methods in the literature; see, for example, [47, 42, 33].

Notation: The following notation will be used throughout. For an integer $n>0$, we will define $[n]:=\{1,2,\ldots,n\}$, $J_{n}$ to be the $n\times n$ hollow matrix with all off-diagonal entries identically set to $1$, and ${\bf 0}_{n}$ to be the $n\times n$ matrix with all entries identically set to $0$.

1.1 (Multiplex) graph matching

The above noisy induced subgraph detection problem depends greatly on the definition of “best fitting” employed. Our approach, generalizing [41] to the multiplex setting, will employ the multiplex template ${\bf H}$ to search the multiplex background graph ${\bf G}$ for possible matches, with goodness of fit measured via a multiplex formulation of the classical graph matching problem (see [9, 19, 16, 46] for excellent reviews of the voluminous graph matching literature). In the monoplex setting, the simplest formulation of the graph matching problem (GMP) can be stated as follows: Given two $n$ -vertex, undirected graphs with respective (weighted) adjacency matrices $A$ and $B$ , find a permutation matrix $P\in\Pi_{n}=\{n\times n$ permutation matrices} in

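$$\operatorname*{arg\,min}_{P\in\Pi_{n}}\|AP-PB\|_{F}=\operatorname*{arg\,max}_{P\in\Pi_{n}}\mathrm{trace}(APB^{T}P^{T}).$$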
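For completeness, the equivalence between the Frobenius-norm and trace forms of the GMP can be checked by a short expansion. The sketch below works with the squared norm (minimizing a nonnegative norm and its square are equivalent) and uses only the fact that a permutation matrix $P$ is orthogonal, so $\|AP\|_{F}=\|A\|_{F}$ and $\|PB\|_{F}=\|B\|_{F}$.

```latex
\begin{aligned}
\|AP-PB\|_{F}^{2}
  &= \|AP\|_{F}^{2}+\|PB\|_{F}^{2}-2\,\mathrm{trace}\big((AP)^{T}PB\big)\\
  &= \|A\|_{F}^{2}+\|B\|_{F}^{2}-2\,\mathrm{trace}\big(P^{T}A^{T}PB\big)\\
  &= \|A\|_{F}^{2}+\|B\|_{F}^{2}-2\,\mathrm{trace}\big(APB^{T}P^{T}\big).
\end{aligned}
```

The last step transposes inside the trace and applies its cyclic property. Since $\|A\|_{F}^{2}+\|B\|_{F}^{2}$ does not depend on $P$, minimizing the left-hand side over $\Pi_{n}$ is the same as maximizing $\mathrm{trace}(APB^{T}P^{T})$.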

Before lifting the graph matching problem to the multiplex setting, we first need to define precisely what we mean by a multiplex graph.

1.1.1 Multiplex networks

The above formulation of the GMP requires both graphs to have the same number $n$ of vertices, though there are myriad ways of adapting the GMP to graphs of different orders (see, for example, Appendix F of [6]). In the multiplex subgraph matching problem at the core of this paper, we view the template ${\bf A}$ as being of equal or lower order than the background ${\bf B}$. Moreover, our definition of multiplex networks, ideally, would allow for differing graph orders across the multiplex layers within a single graph. To allow for these expected data nuances in the multiplex setting, we consider the following multiplex graph model; see [24] for a thorough overview of this and other multiplex network formulations.

Definition 1.1.

The $c$ -tuple ${\bf G}=(G_{1},G_{2},\dots,G_{c})$ is an $n$ -vertex multiplex network if for each $i=1,2,\dots,c,$ we have that $G_{i}\in\mathcal{G}_{n_{i}}=\{n_{i}$ -vertex labeled graphs $\}$ , and the vertex sets $(V_{i}=V(G_{i}))_{i=1}^{c}$ further satisfy the following:

i.

For each $i\in[c]$, we have that $V(G_{i})\subseteq[n]$;

ii.

$\bigcap\limits_{i=1}^{c}V(G_{i})\neq\emptyset$ and $\bigcup\limits_{i=1}^{c}V(G_{i})=[n]$;

iii.

The layers are a priori node aligned; i.e., vertices sharing the same label across layers correspond to the same entity in the network.

Note that each vertex $v\in[n]$ need not appear in each channel $i\in[c]$ , however, we do require that at least one vertex appears simultaneously in all channels. We will denote the set of $c$ -layer, $n$ -vertex multiplex networks via $\mathcal{M}_{n}^{c}$ .
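Conditions (i) and (ii) of Definition 1.1 are purely set-theoretic, so they can be checked mechanically. The following is a minimal sketch under our own conventions: the function name `is_multiplex_vertex_family` is hypothetical, and each layer is represented only by its vertex set with 1-based labels. Condition (iii), node alignment, is a modelling assumption and is not something the code can verify.

```python
from typing import List, Set


def is_multiplex_vertex_family(vertex_sets: List[Set[int]], n: int) -> bool:
    """Check conditions (i) and (ii) of Definition 1.1 for the layer vertex sets.

    Labels are 1-based, so [n] = {1, ..., n}.
    """
    if not vertex_sets:
        return False
    universe = set(range(1, n + 1))
    # (i) every layer's vertex set is a subset of [n]
    if any(not vs <= universe for vs in vertex_sets):
        return False
    # (ii) at least one vertex appears in every layer,
    #      and the layers jointly cover all of [n]
    common = set.intersection(*vertex_sets)
    covered = set.union(*vertex_sets)
    return bool(common) and covered == universe
```

For example, the two layers $\{1,2,3\}$ and $\{2,3,4\}$ form a valid 4-vertex multiplex vertex family, while $\{1,2\}$ and $\{3,4\}$ do not, because no vertex is shared by both layers.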

1.1.2 Multiplex GMP

To lift the monoplex GMP to the general multiplex definition presented above, we consider the following padded formulations of our general multiplex networks (adapted here from [18, 41]). Letting ${\bf H}\in\mathcal{M}_{m}^{c}$ and ${\bf G}\in\mathcal{M}_{n}^{c}$ with $m\leq n$ , we consider the following two schemes for ameliorating the differing graph orders.

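i.

(Naive Padding) For each $i\in[c]$, define the weighted adjacency matrices $\widetilde{A}_{i}\in\mathbb{R}^{m\times m}$ and $\widetilde{B}_{i}\in\mathbb{R}^{n\times n}$ via

$$\widetilde{A}_{i}(u,v)=\begin{cases}1 & \text{if } u,v\in V(H_{i}) \text{ and } \{u,v\}\in E(H_{i});\\ 0 & \text{if } u,v\in V(H_{i}) \text{ and } \{u,v\}\notin E(H_{i});\\ 0 & \text{if } u \text{ or } v\in[m]\setminus V(H_{i});\end{cases}$$

$$\widetilde{B}_{i}(u,v)=\begin{cases}1 & \text{if } u,v\in V(G_{i}) \text{ and } \{u,v\}\in E(G_{i});\\ 0 & \text{if } u,v\in V(G_{i}) \text{ and } \{u,v\}\notin E(G_{i});\\ 0 & \text{if } u \text{ or } v\in[n]\setminus V(G_{i}).\end{cases}$$

Denote ${\bf\widetilde{A}}=(\widetilde{A}_{1},\widetilde{A}_{2},\cdots,\widetilde{A}_{c})$ and ${\bf\widetilde{B}}=(\widetilde{B}_{1},\widetilde{B}_{2},\cdots,\widetilde{B}_{c})$.

ii.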

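(Centered Padding) For each $i\in[c]$, define the weighted adjacency matrices $\widehat{A}_{i}\in\mathbb{R}^{m\times m}$ and $\widehat{B}_{i}\in\mathbb{R}^{n\times n}$ via

$$\begin{aligned}\widehat{A}_{i}(u,v)&=\begin{cases}1 & \text{if } u,v\in V(H_{i}) \text{ and } \{u,v\}\in E(H_{i});\\ -1 & \text{if } u,v\in V(H_{i}) \text{ and } \{u,v\}\notin E(H_{i});\\ 0 & \text{if } u \text{ or } v\in[m]\setminus V(H_{i});\end{cases}\\ \widehat{B}_{i}(u,v)&=\begin{cases}1 & \text{if } u,v\in V(G_{i}) \text{ and } \{u,v\}\in E(G_{i});\\ -1 & \text{if } u,v\in V(G_{i}) \text{ and } \{u,v\}\notin E(G_{i});\\ 0 & \text{if } u \text{ or } v\in[n]\setminus V(G_{i}).\end{cases}\end{aligned}\tag{1}$$

Denote ${\bf\widehat{A}}=(\widehat{A}_{1},\widehat{A}_{2},\cdots,\widehat{A}_{c})$ and ${\bf\widehat{B}}=(\widehat{B}_{1},\widehat{B}_{2},\cdots,\widehat{B}_{c})$.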
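To make the two padding schemes concrete, the sketch below builds the padded adjacency matrix of a single layer. It is illustrative only: the helper name `padded_adjacency` and the 0-based vertex labels are our own choices, and we assume the centered scheme assigns $-1$ to non-edges between vertices present in the layer (the usual $2A-J$ centering of the monoplex setting), with $0$ for any pair involving an absent vertex.

```python
import numpy as np


def padded_adjacency(n, vertices, edges, centered=False):
    """Return the n x n padded adjacency matrix of one layer.

    vertices: labels in {0, ..., n-1} present in this layer (0-based here).
    edges:    iterable of (u, v) pairs between present vertices.
    Naive padding:    edge -> 1, non-edge -> 0,  absent vertex -> 0.
    Centered padding: edge -> 1, non-edge -> -1, absent vertex -> 0 (assumed).
    """
    M = np.zeros((n, n))
    present = sorted(vertices)
    if centered:
        # every pair of present vertices starts as a non-edge (-1);
        # the diagonal stays 0 so the matrix is hollow
        M[np.ix_(present, present)] = -1.0
        np.fill_diagonal(M, 0.0)
    for u, v in edges:
        M[u, v] = M[v, u] = 1.0  # undirected edge
    return M
```

With `n = 4`, layer vertices `{0, 1, 2}` and the single edge `(0, 1)`, the centered matrix has entry $1$ at $(0,1)$, entry $-1$ at $(0,2)$ and $(1,2)$, and zeros in the row and column of the absent vertex `3`.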

The Naive Multiplex Graph Matching Problem (nMGMP) is then defined as finding an element $P\in\Pi_{n}$ in

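$$\operatorname*{arg\,min}_{P\in\Pi_{n}}\sum_{i=1}^{c}\|(\widetilde{A}_{i}\oplus\textbf{0}_{n-m})P-P\widetilde{B}_{i}\|_{F}^{2}=\operatorname*{arg\,min}_{P\in\Pi_{n}}\sum_{i=1}^{c}-\mathrm{tr}\big((\widetilde{A}_{i}\oplus\textbf{0}_{n-m})P\widetilde{B}_{i}P^{T}\big),\tag{2}$$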

where $\textbf{0}_{n-m}$ is the $(n-m)\times(n-m)$ matrix of all $0$’s. The formulation in Eq. (2) effectively seeks to maximize the number of common edges between the multiplex template and multiplex background, where all edges across all channels are weighted equally (see [18, 6] for the monoplex analogue). The Centered Multiplex Graph Matching Problem (cMGMP) is defined as finding an element $P\in\Pi_{n}$ in

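$$\operatorname*{arg\,min}_{P\in\Pi_{n}}\sum_{i=1}^{c}\|(\widehat{A}_{i}\oplus\textbf{0}_{n-m})P-P\widehat{B}_{i}\|_{F}^{2}=\operatorname*{arg\,min}_{P\in\Pi_{n}}\sum_{i=1}^{c}-\mathrm{tr}\big((\widehat{A}_{i}\oplus\textbf{0}_{n-m})P\widehat{B}_{i}P^{T}\big).\tag{3}$$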

If for each $i$, we have that $V(G_{i})=[n]>[m_{i}]=V(H_{i})$, then the formulation in Eq. (3) effectively seeks to minimize the number of disagreements (edge mapped to non-edge and vice versa) induced between the background and the matched subgraphs in the template, where all disagreements across all channels are weighted equally. Given this interpretation, the appropriate padding scheme to deploy in practice depends on the underlying problem assumptions and setting.

Remark 1.

Our formulation of the Multiplex GMP is (assuming channels of equal order across ${\bf A}$ and ${\bf B}$)

$$\operatorname*{arg\,min}_{P\in\Pi_{n}}\sum_{i=1}^{c}\|A_{i}P-PB_{i}\|_{F}^{2},$$

rather than a formulation weighting the matching in each channel via

$$\operatorname*{arg\,min}_{P\in\Pi_{n}}\sum_{i=1}^{c}\lambda_{i}\|A_{i}P-PB_{i}\|_{F}^{2},$$

for $\lambda_{i}>0$. In our subgraph detection setting, we have found that $\lambda_{i}=1$ works suitably well; moreover, this weights each edge in each template channel equally, which may be desirable. In the case that one or more channels is more informative or of higher import than the others, then choosing appropriate $\lambda$’s to overweight the matching in those channels may be desirable.
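The multiplex objective, with or without channel weights, is simple to evaluate for a given permutation. The sketch below is one way to do so under our own conventions (the names `direct_sum_pad` and `mgmp_objective` are hypothetical, labels are 0-based, and `perm[u] = v` means template vertex `u` is matched to background vertex `v`). Passing `weights=None` gives the unweighted choice $\lambda_{i}=1$.

```python
import numpy as np


def direct_sum_pad(A, n):
    """Embed the m x m matrix A as the top-left block of an n x n zero matrix."""
    m = A.shape[0]
    out = np.zeros((n, n))
    out[:m, :m] = A
    return out


def mgmp_objective(A_layers, B_layers, perm, weights=None):
    """Evaluate sum_i lambda_i * ||(A_i (+) 0) P - P B_i||_F^2.

    A_layers: list of m x m (already naive- or centered-padded) template layers.
    B_layers: list of n x n background layers.
    perm:     sequence of length n; row u of P has its 1 in column perm[u].
    weights:  optional positive lambda_i, one per channel (default: all 1).
    """
    n = len(perm)
    P = np.eye(n)[list(perm)]  # permutation matrix with P[u, perm[u]] = 1
    if weights is None:
        weights = [1.0] * len(A_layers)
    total = 0.0
    for lam, A, B in zip(weights, A_layers, B_layers):
        A_pad = direct_sum_pad(A, n)
        total += lam * np.linalg.norm(A_pad @ P - P @ B, "fro") ** 2
    return total
```

Because each channel contributes an additive term, overweighting a channel simply scales its share of the total, which is the design choice discussed in Remark 1.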

2 Multiplex Graph Matching Matched Filters

Given $\mathcal{A}$, a multiplex graph matching algorithm designed to approximately solve Eqs. (2–3), our multiplex graph matching matched filter (M-GMMF), generalizing the monoplex filtering setting of [41], proceeds as in Algorithm 1. Note that in our experiments (and in the pseudocode below), we make use of $\mathcal{A}=\texttt{M-FAQ}$ (see Algorithm 2 in Appendix A.1), but we stress that our approach can utilize any suitable $\mathcal{A}$ equally well.

Effectively, the M-GMMF algorithm uses the multiplex template (and algorithm $\mathcal{A}$ ) to search $\Pi_{n}$ for suitable solutions aligning ${\bf H}$ to ${\bf G}$ . The multiple restarts in Step 2. of the procedure are needed in the case of $\mathcal{A}=\texttt{M-FAQ}$ , as in that setting the objective function is relaxed to an indefinite quadratic program with myriad local minima in the feasible region; these restarts aim to precisely counteract the presence of these local minima by broadly searching the feasible region for a global minimum. For approximate combinatorial $\mathcal{A}$ , the restarts may be appropriate as well, while for continuous, convex relaxation algorithms (see, for example, [6]), this step may not be necessary.
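The restart logic described above can be sketched as follows. This is not Algorithm 1, M-FAQ, or the iGraphMatch implementation: as a stand-in for the solver $\mathcal{A}$ we use a simple greedy 2-swap hill climb (an approximate combinatorial solver), and the names `score`, `swap_search`, and `mgmmf` are hypothetical. The layers are assumed symmetric and already padded by one of the two schemes; the function returns the candidate matches ranked by objective value.

```python
import numpy as np


def score(A_pad, B_layers, perm):
    """sum_i tr((A_i (+) 0) P B_i P^T) for symmetric layers; larger is better."""
    return sum(float((A * B[np.ix_(perm, perm)]).sum())
               for A, B in zip(A_pad, B_layers))


def swap_search(A_pad, B_layers, perm):
    """Stand-in solver: greedy 2-swap hill climbing started from perm."""
    perm = perm.copy()
    best = score(A_pad, B_layers, perm)
    n = len(perm)
    improved = True
    while improved:
        improved = False
        for u in range(n):
            for w in range(u + 1, n):
                perm[[u, w]] = perm[[w, u]]          # try swapping two images
                s = score(A_pad, B_layers, perm)
                if s > best:
                    best, improved = s, True         # keep the improvement
                else:
                    perm[[u, w]] = perm[[w, u]]      # undo the swap
    return perm, best


def mgmmf(A_layers, B_layers, restarts=20, seed=0):
    """Run the solver from several random starts; return ranked candidates.

    Each candidate is (score, matched), where matched[u] is the background
    vertex aligned to template vertex u.
    """
    rng = np.random.default_rng(seed)
    n, m = B_layers[0].shape[0], A_layers[0].shape[0]
    A_pad = []
    for A in A_layers:
        Ap = np.zeros((n, n))
        Ap[:m, :m] = A                               # A_i direct-sum 0_{n-m}
        A_pad.append(Ap)
    results = []
    for _ in range(restarts):
        perm, s = swap_search(A_pad, B_layers, rng.permutation(n))
        results.append((s, perm[:m].copy()))
    results.sort(key=lambda t: -t[0])                # best candidate first
    return results
```

Each restart can end in a different local optimum, which is why the ranked list, rather than a single answer, is returned; a gradient-based solver such as M-FAQ would replace `swap_search` without changing the outer loop.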

Note that code implementing the above M-GMMF and M-FAQ procedures can be downloaded as part of our R package, iGraphMatch, which is available on CRAN or can be downloaded at https://github.com/dpmcsuss/iGraphMatch.

2.1 Multiplex Matchability

In [41], the authors considered an error model wherein the template ${\bf H}$ is an errorful induced subgraph of the background ${\bf G}$ in the monoplex setting. The aim of the Monoplex-GMMF approach then was to recover the vertices in ${\bf G}$ corresponding to ${\bf H}$ . Can we recover the analogous results in the multiplex setting? To frame and attack this problem statistically, we consider the following error model which we will use to generate a multiplex background graph ${\bf G}\in\mathcal{M}^{c}_{n}$ and a multiplex template ${\bf H}\in\mathcal{M}^{c}_{m}$ with $m<n$ .

Definition 2.1 (See [4]).

Consider a graph $G$ with $V(G)\subset[n]$ . Let the centered, padded adjacency matrix (as in Eq. (1)) of $G$ be denoted $\widehat{A}\in\mathbb{R}^{n\times n}$ . Let $E\in[0,1]^{n\times n}$ be a symmetric, hollow matrix. The graph-valued random variable $\mathcal{E}(G)$ with vertex set equal to $V(G)$ and random centered, padded adjacency matrix $\widehat{A}_{G,E}$ , which models passing $G$ through an errorful channel $E$ , is defined as follows. For each $\{i,j\}\in\binom{[n]}{2}$ ,

$$\widehat{A}_{G,E}(i,j)=\widehat{A}(i,j)\cdot(1-2X(i,j)),$$

where $X(i,j)\stackrel{{\scriptstyle ind.}}{{\sim}}Bern(E(i,j))$ .
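Definition 2.1 amounts to flipping the sign of each centered entry independently with probability $E(i,j)$. A minimal sampling sketch is given below; the function name `errorful_channel` is ours, and `rng` is assumed to be a NumPy `Generator`.

```python
import numpy as np


def errorful_channel(A_hat, E, rng):
    """Sample the centered, padded adjacency matrix of E(G) from Definition 2.1.

    For each unordered pair {i, j}, draw X(i, j) ~ Bernoulli(E[i, j])
    independently and multiply the entry by (1 - 2 X(i, j)), which flips its
    sign when X(i, j) = 1. Zero (padded) entries therefore stay zero.
    """
    n = A_hat.shape[0]
    iu = np.triu_indices(n, k=1)                 # one draw per unordered pair
    X = np.zeros((n, n))
    X[iu] = rng.random(len(iu[0])) < E[iu]
    X = X + X.T                                  # symmetric, hollow indicator
    return A_hat * (1.0 - 2.0 * X)
```

Taking $E$ to be the zero matrix returns the input unchanged, while $E=J_{n}$ flips every off-diagonal entry, turning edges into non-edges and vice versa.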

The two generative models we then consider are defined via:

i.

(Single Channel Source, Error Multiplex, abbreviated ME) There is a single non-random background source graph $W\in\mathcal{G}_{n}$ and non-random source template $T=W[m]\in\mathcal{G}_{m}$, and two multi-channel errorful filters ${\bf E}^{(1)}=(E^{(1)}_{1},\ldots,E^{(1)}_{c})$, with each $E^{(1)}_{i}$ acting on $W$, and ${\bf E}^{(2)}=(E^{(2)}_{1},\ldots,E^{(2)}_{c})$, with each $E^{(2)}_{i}$ acting on $T$. We observe ${\bf G}=(\mathcal{E}^{(1)}_{1}(W),\ldots,\mathcal{E}^{(1)}_{c}(W))$ as the multiplex background and ${\bf H}=(\mathcal{E}^{(2)}_{1}(T),\ldots,\mathcal{E}^{(2)}_{c}(T))$ as the multiplex template. By assumption, the errorful filters act independently across channels within ${\bf G}$ and ${\bf H}$, and independently across ${\bf G}$ and ${\bf H}$. In this model, by construction each $V(H_{i})=[m]$ and each $V(G_{i})=[n]$.

ii.

(Single Channel Errors, Source Multiplex, abbreviated MS) The non-random background and non-random template source graphs are multiplex. To wit, let ${\bf T}\in\mathcal{M}^{c}_{m}$ and ${\bf W}\in\mathcal{M}^{c}_{n}$ satisfy the following: For each $i\in[c]$ , let $\widehat{\bf C}$ and $\widehat{\bf D}$ be the centered paddings of ${\bf T}$ and ${\bf W}$ respectively. We assume then that $\widehat{C}_{i}=\widehat{D}_{i}[m]$ (i.e., $\widehat{C}_{i}$ —the padded adjacency matrix of $T_{i}$ —is the $m\times m$ principal submatrix of $\widehat{D}_{i}$ —the padded adjacency matrix of $W_{i}$ ). There are two multi-channel errorful filters: ${\bf E}^{(1)}=(E^{(1)}_{1},\ldots,E^{(1)}_{c})$ and ${\bf E}^{(2)}=(E^{(2)}_{1},\ldots,E^{(2)}_{c})$ . For each $i\in[c]$ , $E^{(1)}_{i}\in\mathbb{R}^{n\times n}$ acts on $W_{i}$ , and $E^{(2)}_{i}\in\mathbb{R}^{m\times m}$ acts on $T_{i}$ . We observe ${\bf G}=(\mathcal{E}^{(1)}_{1}(W_{1}),\mathcal{E}^{(1)}_{2}(W_{2})\ldots,\mathcal{E}^{(1)}_{c}(W_{c}))$ as the multiplex background and ${\bf H}=(\mathcal{E}^{(2)}_{1}(T_{1}),\mathcal{E}^{(2)}_{2}(T_{2}),\ldots,\mathcal{E}^{(2)}_{c}(T_{c}))$ as the multiplex template. As above, the errorful filters act independently across channels within ${\bf G}$ and ${\bf H}$ , and independently across ${\bf G}$ and ${\bf H}$ . Note that if the template (resp., background) channels have non-identical vertex sets, then this will be preserved in the errorful template (resp., background).

It may be convenient to view $T$ and $W$ (resp., ${\bf T}$ and ${\bf W}$ ) as realizations from graph-valued random variables in the ME (resp., MS) model. In this case, we will assume the actions of the errorful filters on $T$ and $W$ (resp., ${\bf T}$ and ${\bf W}$ ) are also independent of the random $T$ and $W$ (resp., ${\bf T}$ and ${\bf W}$ ).

Considering the models above, in order for our M-GMMF approach to possibly recover the true errorful induced subgraph of ${\bf G}$ corresponding to ${\bf H}$ , we need for the global minimum of the Multiplex GMP to be in $\mathcal{P}_{m,n}:=\{I_{m}\oplus P:P\in\Pi_{n-m}\}$ . This is the multiplex analogue of graph matchability, i.e., uncovering conditions under which oracle graph matching will recover a latent vertex alignment. Here, that alignment is represented by ${\bf H}$ being an errorful version of ${\bf G}[m]$ ; see, for example, [14, 22, 38, 37, 30, 23, 12, 11, 5, 13] for a litany of graph matchability results in the monoplex setting.

2.2 MS model matchability

In this section, we will explore the benefit of considering multiplex versus monoplex networks when considering template matchability in the MS model. We note here that while the formal theory underlying the matchability results in the multiplex setting differs only slightly from the monoplex setting of [41], we stress that the end results demonstrate the utility of considering multiple channels.

In the MS model, let ${\bf T}\in\mathcal{M}_{m}^{c}$ and ${\bf W}\in\mathcal{M}_{n}^{c}$ be the respective template and background source graphs, with respective centered, padded adjacency matrices given respectively by $\widehat{\bf C}$ and $\widehat{\bf D}$ satisfying $\widehat{C}_{i}=\widehat{D}_{i}[m]$ for all $i\in[c]$ . Assume that the errorful filters satisfy for each $i\in[c]$ , $E_{i}^{(1)}=q_{i}J_{n}$ and $E_{i}^{(2)}=s_{i}J_{m}$ (where $s_{i}=s_{i}(n)$ and $q_{i}=q_{i}(n)$ are allowed to vary with $n$ ). If $c=1$ , and $s_{1}=q_{1}=1/2$ , then the observed background and template are effectively independent ER $(n,1/2)$ and ER $(m,1/2)$ networks, respectively. It is immediate then that the optimal permutation aligning the background to the template will almost surely not be in $\mathcal{P}_{m,n}$ .

Consider now $c>1$. Let $\mathcal{B}=\{i\in[c]\text{ s.t. }s_{i}\text{ or }q_{i}=1/2\}$; these “bad” channels act to obfuscate the latent alignment between $\widehat{\bf C}$ and $\widehat{\bf D}$ by effectively whitening the signal present in the alignment within the channels. Suppose that there exist constants $\alpha\leq 1$, $\beta>0$, and $n_{0}\in\mathbb{Z}_{>0}$ such that for all $n>n_{0}$, $m=m(n)$ satisfies $m^{\alpha}>\beta\log n$. For each $m$, denote the set of permutations that permute exactly $k$ labels of $[m]$ by $\Pi_{n,m,k}$, and for each $P\in\Pi_{n}$ (with associated permutation $\sigma_{p}$), define

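$$\Delta_{P}=\Big\{\{i,j\}\in\binom{[m]}{2}\text{ s.t. }\{i,j\}\neq\{\sigma_{p}(i),\sigma_{p}(j)\}\Big\},$$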

and for each $i\in[c]$ , define

[TABLE]

Suppose that there exists an $n_{1}>0$ such that for all $n>n_{1}$ , we have that for all $k\in[m=m(n)]$ and all $P\in\Pi_{n,m,k}$ ,

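$$\sum_{i\in[c]\setminus\mathcal{B}}\big(2|\Delta_{P}^{(i,1)}|+|\Delta_{P}^{(i,2)}|\big)(1-2s_{i})(1-2q_{i})\geq k\,\frac{672\,m^{1+\alpha}c}{\beta}.$$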

Letting $\widehat{\bf A}$ and $\widehat{\bf B}$ be the padded, centered adjacency matrices of ${\bf H}$ and ${\bf G}$ respectively (the errorful ${\bf T}$ and ${\bf W}$ ), for $n>\mathfrak{n}=\max(n_{0},n_{1})$ we have that

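$$\mathbb{P}\Big(\operatorname*{arg\,min}_{P\in\Pi_{n}}\sum_{i=1}^{c}\|(\widehat{A}_{i}\oplus\textbf{0}_{n-m})P-P\widehat{B}_{i}\|_{F}^{2}\not\subset\mathcal{P}_{m,n}\Big)\leq 2n^{-2},\tag{5}$$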

(see Appendix A.2 for proof of this bound).

Exploring Eq. (5) in the ER setting further, we consider the following setup. If $c=c(n)\leq n$ , and for each $i\in[c]$ , $W_{i}\sim ER(n,p_{i})$ with $p_{i}=p_{i}(n)\leq 1/2$ , then for each $P\in\Pi_{n,m,k}$ , a simple application of McDiarmid’s inequality (see Appendix 1) yields that

$$|\Delta_{P}^{(i,1)}|\in\big(|\Delta_{P}|\,p_{i}(1-p_{i}),\;3|\Delta_{P}|\,p_{i}(1-p_{i})\big),$$

with probability at least

$$1-2\exp\Big\{\frac{-2|\Delta_{P}|\,p_{i}^{2}(1-p_{i})^{2}}{8}\Big\}.\tag{7}$$

Note that if $m>6$ , then $mk/3\leq|\Delta_{P}|\leq mk$ , so that with probability at least Eq. (7),

$$|\Delta_{P}^{(i,1)}|\in(1/3,3)\cdot mk\,p_{i}(1-p_{i}).$$
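The combinatorial bound $mk/3\leq|\Delta_{P}|\leq mk$ used here can be sanity-checked numerically. The sketch below is illustrative only: it takes $\Delta_{P}$ to be the set of template pairs $\{i,j\}$ that the permutation does not fix as a set, reads "permutes exactly $k$ labels of $[m]$" as moving exactly $k$ template labels, and uses 0-based labels; the function names are hypothetical.

```python
from itertools import combinations
import random


def delta_P(sigma, m):
    """Template pairs {i, j} with {i, j} != {sigma(i), sigma(j)}.

    sigma is a 0-based permutation of range(n) given as a list;
    the template labels are 0, ..., m-1.
    """
    return [(i, j) for i, j in combinations(range(m), 2)
            if {i, j} != {sigma[i], sigma[j]}]


def moved_labels(sigma, m):
    """k = number of template labels i with sigma(i) != i."""
    return sum(1 for i in range(m) if sigma[i] != i)


def bounds_hold(n, m, trials=200, seed=0):
    """Check m*k/3 <= |Delta_P| <= m*k on random permutations of range(n)."""
    rng = random.Random(seed)
    for _ in range(trials):
        sigma = list(range(n))
        rng.shuffle(sigma)
        k = moved_labels(sigma, m)
        if not (m * k / 3 <= len(delta_P(sigma, m)) <= m * k):
            return False
    return True
```

Calling `bounds_hold(12, 8)` exercises the bound for a template of order $m=8$ inside a background of order $n=12$; a random check of this kind illustrates the inequality but is of course not a proof.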

Suppose that $\alpha<1$ , and that there exists an $n_{2}>0$ such that for all $n>n_{2}$ , we have that for all $i\in[c]$ , $mp_{i}^{2}\geq 384\log n$ , and

$$\sum_{i\in[c]\setminus\mathcal{B}}p_{i}(1-2s_{i})(1-2q_{i})\geq\frac{6048\,m^{\alpha-1}c}{\beta}.$$

For $n>\max(n_{2},n_{0})$ , we then have

[TABLE]

For proof of Eq. (9), see Appendix A.4.

We have thus proven the following theorem:

Theorem 1.

With setup as above, suppose that $\alpha<1$, and $p_{i}=p$ is a fixed constant that does not vary with $n$. Further suppose that $s_{i}=s<1/2$, and for $c_{1}$ channels $q_{i}=q<1/2$, and for $c_{2}=c-c_{1}-|\mathcal{B}|$ channels $q_{i}=1-q>1/2$ (where $c_{1}=c_{1}(n)$, $c_{2}=c_{2}(n)$, $s=s(n)$ and $q=q(n)$ are allowed to vary with $n$). Then there exist constants $\gamma,\xi>0$, and $n_{2}\in\mathbb{Z}_{>0}$ such that if for all $n>n_{2}$,

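$$m\,p_{i}^{2}\geq\xi\log n\text{ for all } i\in[c],$$

$$(1/2-s)(1/2-q)(c_{1}-c_{2})>\gamma m^{\alpha-1}c,\tag{Condition 1}$$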

then for $n>\max(n_{0},n_{2})$ , $\mathbb{P}(\mathcal{A}_{n})\leq 4n^{-2}.$ If $s$ , $q$ , $c$ , $c_{1}$ and $c_{2}$ are fixed constants that do not vary with $n$ , we need only require $c_{1}>c_{2}$ rather than Condition (1).

2.2.1 Strength in numbers

Consider $c_{2}=|\mathcal{B}|=0$ in Theorem 1. Condition (1) then reduces to

$$(1/2-s)(1/2-q)\,c>\gamma m^{\alpha-1},$$

and large values (i.e., close to $1/2$ ) of $s$ and $q$ can be mitigated by choosing an appropriately large $c$ ; effectively, multiple channels can amplify the weak signal present in each individual channel.

We explore this further in the following experiment. We will look at two different cases specifically, when $m=n$ and when $m<n$ . First, considering $n=m=100$ (to mitigate possible effects of template order on matching accuracy), we let ${\bf G},{\bf H}\in\mathcal{M}^{c}_{100}$ for $c$ ranging over $\{1,2,\cdots,10\}$ . For each $i\in[c]$ , we have that $(G_{i},H_{i})\sim\mathrm{ER}(100,0.5,\rho)$ (so that $G_{i}$ and $H_{i}$ are marginally ER(100,0.5) and edges across graphs are independent except that for each $\{j,k\}\in\binom{[100]}{2}$ , we have that corr $(\mathds{1}\{\{j,k\}\in E(G_{i})\},\mathds{1}\{\{j,k\}\in E(H_{i})\})=\rho$ ).
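A pair of $\rho$-correlated Erdős–Rényi graphs of the kind used in this experiment can be sampled edge by edge. The sketch below is one standard construction, not necessarily the one used to produce the figures; the function name `correlated_er_pair` is ours. It draws each edge of $G$ as Bernoulli$(p)$ and then draws the matching edge of $H$ with a conditional probability chosen so that $H$ is also marginally ER$(n,p)$ and the edge-wise correlation equals $\rho$.

```python
import numpy as np


def correlated_er_pair(n, p, rho, rng):
    """Sample adjacency matrices (G, H), each marginally ER(n, p), with
    edge-wise correlation rho and the identity as latent alignment.

    Valid whenever p + rho*(1-p) and p*(1-rho) both lie in [0, 1]
    (for p = 0.5 this covers every rho in [-1, 1]).
    """
    iu = np.triu_indices(n, k=1)
    g = rng.random(len(iu[0])) < p
    # P(h = 1 | g = 1) = p + rho (1 - p),   P(h = 1 | g = 0) = p (1 - rho)
    prob_h = np.where(g, p + rho * (1 - p), p * (1 - rho))
    h = rng.random(len(iu[0])) < prob_h
    G = np.zeros((n, n))
    H = np.zeros((n, n))
    G[iu] = g
    H[iu] = h
    return G + G.T, H + H.T                      # symmetric, hollow matrices
```

With these conditional probabilities, $\mathbb{P}(h=1)=p\,(p+\rho(1-p))+(1-p)\,p(1-\rho)=p$ and $\mathrm{Cov}(g,h)=\rho\,p(1-p)$, so the correlation is $\rho$ as required. A multiplex pair with $c$ channels is obtained by calling the function $c$ times with independent draws.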

Within this model, the channels are endowed with a natural vertex correspondence across $G_{i}$ and $H_{i}$ , namely the identity mapping. Note that in the $W_{i}\sim$ ER $(n,p_{i})$ MS model setting, we have that $\text{Cov}(\mathds{1}\{\{j,k\}\in E(G_{i})\},\mathds{1}\{\{j,k\}\in E(H_{i})\})=p_{i}(1-p_{i})(1-2s_{i})(1-2q_{i}),$ so that the correlation between edges in $G_{i}$ and $H_{i}$ can be made positive or negative with judiciously chosen $s_{i}$ and $q_{i}$ . Considering $\rho$ varying over $\{0.1,0.2,0.3,0.4,0.5\}$ , we match ${\bf G}$ and ${\bf H}$ using M-FAQ (Algorithm 2 using $s=10$ seeded vertices [18]). Results are plotted in Figure 1. In Figure 1, we plot the mean matching accuracy (i.e., the fraction of vertices whose latent alignment is recovered correctly) of M-FAQ versus $c$ , averaged over 2000 Monte Carlo replicates. For each choice of parameters, we also plot (via the partially transparent points) the accuracy distribution corresponding to the MC replicates. In red (resp., olive, green, blue, purple) we plot the results for $\rho=0.1$ (resp., $\rho=0.2$ , $\rho=0.3$ , $\rho=0.4$ , $\rho=0.5$ ). From Figure 1, we see the expected relationship: in low correlation settings where M-FAQ is unable to align the monoplex graphs, this can often be overcome by considering $c>1$ . Indeed, in all cases, save $\rho=0.1$ , perfect matching is achieved using $c\geq 8$ channels.
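The covariance identity quoted above for the ER MS model follows from the law of total covariance. As a sketch, fix a channel $i$ and a pair $\{j,k\}$, write $w\sim\mathrm{Bern}(p_{i})$ for the source edge indicator, and let $x\sim\mathrm{Bern}(q_{i})$ and $y\sim\mathrm{Bern}(s_{i})$ be the independent flip indicators of the background and template filters, so that $g=w\oplus x$ and $h=w\oplus y$ are the observed edge indicators (the symbols $w,x,y,g,h$ are our shorthand).

```latex
\begin{aligned}
\mathbb{E}[g\mid w] &= q_{i}+w(1-2q_{i}), \qquad
\mathbb{E}[h\mid w] = s_{i}+w(1-2s_{i}),\\
\mathrm{Cov}(g,h)
  &= \mathbb{E}\big[\mathrm{Cov}(g,h\mid w)\big]
     +\mathrm{Cov}\big(\mathbb{E}[g\mid w],\,\mathbb{E}[h\mid w]\big)\\
  &= 0+(1-2q_{i})(1-2s_{i})\,\mathrm{Var}(w)\\
  &= p_{i}(1-p_{i})(1-2s_{i})(1-2q_{i}).
\end{aligned}
```

The conditional covariance vanishes because $x$ and $y$ are independent given $w$. The sign of the result is the sign of $(1-2s_{i})(1-2q_{i})$, which is why a channel with exactly one of $s_{i},q_{i}$ above $1/2$ is anti-correlated.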

Next, we look at the case when $m<n$ . In addition to examining the effect of multiple channels when weak signal is present across channels, we wish to compare the effect of different padding schemes (Naive vs Centered) in terms of the matching accuracy. We analyze the padding scheme’s effectiveness first, by varying the values of the correlation $\rho\in\{0.1,0.2,0.3,0.4,0.5\}$ while keeping $n,m$ constant (see Figure 2) and second, by varying the background size $n\in\{100,500,1000,2000\}$ while the template size $m$ and the correlation $\rho$ remain constant (see Figure 3). Using the Naive (resp. Centered) padding scheme, we let $({\bf\widetilde{G}},{\bf\widetilde{H}})\in\mathcal{M}^{c}_{n}$ (resp. $({\bf\widehat{G}},{\bf\widehat{H}})\in\mathcal{M}^{c}_{n}$ ) for $c$ ranging over $\{1,2,\cdots,10\}$ . Utilizing $s=10$ seeds, we match ${\bf\widetilde{G}}$ and ${\bf\widetilde{H}}$ (resp. ${\bf\widehat{G}}$ and ${\bf\widehat{H}}$ ) using M-FAQ (Algorithm 2). Results are plotted in Figures 2 and 3. As in Figure 1, we plot the mean matching accuracy (i.e., the fraction of vertices whose latent alignment is recovered correctly) of M-FAQ versus $c$ , averaged over 100 MC replicates. For each choice of parameters, we also plot (via the partially transparent points) the accuracy distribution corresponding to the MC replicates. In Figure 2, in red (resp., olive, green, blue, purple) we plot the results for $\rho=0.1$ (resp., $\rho=0.2$ , $\rho=0.3$ , $\rho=0.4$ , $\rho=0.5$ ). In Figure 3, in red (resp., green, blue, purple) we plot the results for $n=100$ (resp., $n=500$ , $n=1000$ , $n=2000$ ).

All figures demonstrate that even though the M-FAQ algorithm is unable to align the monoplex graphs when $c=1$, this can often be overcome by considering $c>1$. Moreover, Figures 2 and 3 show that the Centered Padding scheme achieves better matching accuracy between channels than the Naive Padding scheme. Akin to Figure 1, Figure 2 illustrates that the matching accuracy increases as the correlation increases. Finally, in Figure 3, we observe that the matching accuracy decreases as the ratio between the template size and the background size decreases.

2.2.2 The good outweighs the bad

In this section, we explore the ability of the signal in “good” channels to overcome the obfuscating effect of “bad” channels. To wit, consider Condition (1) with $c_{2}>0$ . We see that if there are enough channels (i.e., $c_{1}$ is sufficiently large) with positive correlation ( $s_{i},q_{i}<1/2$ ), then the template and background remain matchable even in the presence of (potentially) multiple anti-correlated channels.
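The trade-off between good and anti-correlated channels can be illustrated with the per-channel quantity $p_{i}(1-2s_{i})(1-2q_{i})$, summed over the channels outside $\mathcal{B}$. The sketch below only evaluates this sum for given parameters; it does not check the theorem's threshold, whose constants are not specified here, and the function names are hypothetical.

```python
def channel_signal(p, s, q):
    """Per-channel term p_i (1 - 2 s_i)(1 - 2 q_i)."""
    return p * (1 - 2 * s) * (1 - 2 * q)


def total_signal(channels):
    """Sum the per-channel terms over (p, s, q) triples, skipping 'bad'
    channels in which s_i or q_i equals 1/2."""
    return sum(channel_signal(p, s, q) for p, s, q in channels
               if s != 0.5 and q != 0.5)
```

For instance, with $p=0.5$ and $s=0.3$ throughout, seven channels with $q=0.3$ and three with $q=0.7$ give a total of $0.5\cdot 0.4\cdot 0.4\cdot(7-3)=0.32$: each anti-correlated channel cancels exactly one good channel, matching the factor $(c_{1}-c_{2})$.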

We explore this further in the following experiment. As in the previous subsection 2.2.1, we study this “obfuscating” effect for both $m=n$ and $m<n$ cases. Again consider $n=m=100$ , and let ${\bf G},{\bf H}\in\mathcal{M}^{10}_{100}$ (i.e., $c=10$ ), where for $i\in[10]$ we have that $(G_{i},H_{i})\sim\mathrm{ER}(100,0.5,\rho)$ . Under the same setting, we let $n=500$ and we apply Naive Padding in $({\bf G},{\bf H})\in(\mathcal{M}^{10}_{500},\mathcal{M}^{10}_{100})$ , so that $(\widetilde{G}_{i},\widetilde{H}_{i})\sim\mathrm{ER}(500,0.5,\rho)$ for all $i\in[10]$ .

Considering $\rho$ to be either $\rho=r$ (for $c_{g}$ channels) or $\rho=-r$ (for $c_{b}=c-c_{g}$ channels), where $r$ varies in $\{0.1,0.2,0.3,0.4,0.5\}$ , we plot the matching accuracy (averaged over 2000 (left panel) and 100 (right panel) Monte Carlo replicates) obtained by M-FAQ (with 10 seeds) versus $c_{b}$ in Figure 4. For each choice of parameters, we also plot (via the partially transparent points) the accuracy achieved by each MC replicate. In red (resp., olive, green, blue, purple) we plot the results for $r=0.1$ (resp., $r=0.2$ , $r=0.3$ , $r=0.4$ , $r=0.5$ ). From the figure, we see the expected relationship: matching at higher levels of $\rho$ yields better accuracy, and more robustness to channels with negative correlation. Further, we notice that the matching accuracy in the right panel (i.e., $m<n$ ) is not as good as in the left panel (i.e., $m=n$ ). We make this phenomenon more clear in Figure 6.

In addition, we study the effect of different padding schemes (Naive vs Centered) in terms of the matching accuracy. We analyze the padding scheme’s effectiveness first, by varying the values of the correlation $r\in\{0.3,0.4,0.5,0.6,0.7\}$ while keeping $n,m$ constant (see Figure 5) and second, by varying the number of the background vertices $n\in\{100,500,1000,2000\}$ while the template size $m$ remains the same (see Figure 6). Using the Naive (resp. Centered) padding scheme, we let $({\bf\widetilde{G}},{\bf\widetilde{H}})\in\mathcal{M}^{c}_{n}$ (resp. $({\bf\widehat{G}},{\bf\widehat{H}})\in\mathcal{M}^{c}_{n}$ ) for $c$ ranging over $\{1,2,\cdots,10\}$ . Utilizing $s=10$ seeds, we match ${\bf\widetilde{G}}$ and ${\bf\widetilde{H}}$ (resp. ${\bf\widehat{G}}$ and ${\bf\widehat{H}}$ ) using M-FAQ (Algorithm 2). Results are plotted in Figures 5 and 6. As in Figure 4, we plot the mean matching accuracy (i.e., the fraction of vertices whose latent alignment is recovered correctly) of M-FAQ versus $c_{b}$ , averaged over 100 MC replicates. For each choice of parameters, we also plot (via the partially transparent points) the accuracy distribution corresponding to the MC replicates. In Figure 5, in red (resp., olive, green, blue, purple) we plot the results for $r=0.3$ (resp., $r=0.4$ , $r=0.5$ , $r=0.6$ , $r=0.7$ ). In Figure 6, in red (resp., green, blue, purple) we plot the results for $n=100$ (resp., $n=500$ , $n=1000$ , $n=2000$ ).

From Figures 4 and 5, we observe that matching at higher levels of $\rho$ yields better accuracy, and more robustness to channels with negative correlation. Moreover, Figures 5 and 6 show that the Centered Padding scheme achieves better performance in terms of matching accuracy than the Naive Padding scheme.

2.3 ME model matchability

To derive analogous results to those in Section 2.2 in the ME model, we consider the following setting. Letting $W\in\mathcal{G}_{n}$ and $T=W[m]$ be the respective background and template source graphs, we again assume that there exist constants $\alpha\leq 1$, $\beta>0$, and $n_{0}\in\mathbb{Z}_{>0}$ such that for all $n>n_{0}$, $m=m(n)$ satisfies $m^{\alpha}\geq\beta\log n$. Further assume that for each $i\in[c=c(n)]$ the errorful filters satisfy

[TABLE]

For each $P\in\Pi(n)$ , define

[TABLE]

where $\Delta_{P}$ is defined as in Eq. (4). Suppose that there exists an $n_{1}>0$ such that for all $n>n_{1}$ , we have that for all $k\in[m=m(n)]$ and all $P\in\Pi_{n,m,k}$

[TABLE]

then

[TABLE]

where the bound in Eq. (13) uses Appendix A.5 and then follows mutatis mutandis from the proof in Appendix A.2.

Exploring this further in the ER setting, consider $W\sim\mathrm{ER}(n,p=p(n))$ with $p\leq 1/2$. As in Eq. (7), for each $j=1,2$, we then have that

[TABLE]

with probability at least

[TABLE]

Note that if $m>6$ , then $mk/3\leq|\Delta_{P}|\leq mk$ , so that with probability at least Eq. (14),

[TABLE]

Suppose that $\alpha<1$ , and that there exists an $n_{2}>0$ such that for all $n>n_{2}$ , we have $mp^{2}\geq 1344\log n$ , and

[TABLE]

Then for $n>\max(n_{0},n_{2})$ , $\mathbb{P}(\mathcal{A}_{n})\leq 6n^{-2}$ . We have the following theorem (whose proof follows mutatis mutandis to that of Theorem 1 and so is omitted):

Theorem 2.

With setup as above, suppose that $\alpha<1$ . For

[TABLE]

where $e_{1}=e_{1}(n)$ and $e_{2}=e_{2}(n)$ can vary in $n$ and $c=c_{1}+c_{2}+c_{3}+c_{4}$. Then there exist constants $\gamma,\xi>0$ and $n_{2}\in\mathbb{Z}_{>0}$ such that if for all $n>n_{2}$

[TABLE]

then for $n>\max(n_{0},n_{2})$ , $\mathbb{P}(\mathcal{A}_{n})\leq 6n^{-2}.$ If $e_{1}$ , $e_{2}$ , and $c$ are fixed in $n$ , we need only require $c_{1}+c_{2}<c_{3}+c_{4}$ for $\mathbb{P}(\mathcal{A}_{n})\leq 6n^{-2}$ to hold for sufficiently large $n$ .

3 Experiments

Our previous simulations explored the effect of multiple channels on multiplex matchability. We next consider the performance of our multiplex matched filter approach in detecting a hidden template in a multilayer social media network from [31]. The background network contains $3$ aligned channels representing user activity in FriendFeed, Twitter and Youtube (where the Youtube and Twitter channels were generated via FriendFeed, which aggregates user information across these platforms). In total, there are $6{,}407$ unique vertices across the three channels, with the channel-specific networks satisfying:

[TABLE]

Given a 35 vertex multiplex template ${\bf H}$ created by Pacific Northwest National Laboratories for the DARPA MAA program, we ran our M-GMMF algorithm (Algorithm 1) to attempt to recover the template in ${\bf G}$ ; results are summarized below.

In our first experiment, we considered running “cold-start” M-GMMF; that is, no prior information (in the form of seeds, hard or soft) is utilized in the algorithm. We padded the graphs via the Naive Padding and Centered Padding regimes of Section 1.1.2, and for each padding regime, we ran M-GMMF with $N=100$ random restarts. Numeric results are summarized in Table 1 (with the best recovered background signals also plotted in Figure 7). While the best recovered signal in the Naive Padding regime captures all but two template edges, this is at the expense of many extraneous background edges that do not appear in the template. On the other hand, the Centered Padding regime recovers most of the template edges (across the three channels) with minimal extraneous edges in the recovered signal.

The M-FAQ algorithmic primitive (Algorithm 2) used in our implementation of M-GMMF is most effective when it can leverage a priori available matching data in the form of seeded vertices. Seeds can either come in the form of hard seeds (a priori known 1–to–1 matches; here that would translate to template vertices whose exact match is known in the background) or soft seeds (where a soft seeded vertex $v$ in ${\bf H}$ has an a priori known distribution over possible matches in ${\bf G}$ ; here this would translate into template vertices with a list of candidate matches in the background). While hard seeds are costly and often unavailable in practice, there are many scalable procedures in the literature for automatically generating soft seed matches. Here, we use as a soft-seeding the output of [33, 34], a filtering approach for finding all subgraphs of the background network homomorphic to the template.

For each node in the template, the output of [33, 34] produces a multiset of candidate matches in the background, where each candidate match corresponds to a template copy contained in the background as a subgraph (not necessarily as an induced subgraph). We convert the candidate matches into probabilities by simply converting the multiset to a count vector and normalizing the count vector to sum to $1$ . We then consider the normalized count vectors as rows of a stochastic matrix; this stochastic matrix provides M-FAQ with a soft-seeding which can be used to initialize the algorithm.
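The conversion from candidate multisets to a soft-seeding matrix can be sketched as follows. This is a minimal illustration, assuming the candidates are supplied as a dictionary from template vertex index to a list of background vertex indices (with repeats); the function name, this input format, and the uniform fallback for template vertices without candidates are assumptions, not details taken from [33, 34] or from the authors' implementation.

```python
from collections import Counter


def soft_seed_matrix(candidates, m, n):
    """Build an m x n row-stochastic matrix from per-vertex candidate multisets.

    candidates: dict mapping template vertex (0..m-1) to a list of background
    vertices (0..n-1); a background vertex may appear several times.
    """
    rows = []
    for v in range(m):
        counts = Counter(candidates.get(v, []))
        total = sum(counts.values())
        if total == 0:
            # Assumption: fall back to a uniform row when no candidate is known.
            rows.append([1.0 / n] * n)
        else:
            # Count vector normalized to sum to 1.
            rows.append([counts[u] / total for u in range(n)])
    return rows


# Template vertex 0 has candidates {2, 2, 3}; vertex 1 has the single candidate 0.
S = soft_seed_matrix({0: [2, 2, 3], 1: [0]}, m=3, n=4)
```

Each row of `S` sums to one, so the matrix can serve as a (rectangular) stochastic initialization of the kind described above.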

Considering random restarts as perturbations (akin to Step 2 of Algorithm 1) of the soft-seeding (conditioned on retaining nonnegative entries), we ran M-GMMF using a generalization of the Centered Padding regime, which is defined as follows: For each $i\in[c]$ , define the weighted adjacency matrices $\breve{A}_{i}\in\mathbb{R}^{m\times m}$ and $\breve{B}_{i}\in\mathbb{R}^{n\times n}$ via

[TABLE]

where we vary $w$ from $0$ to $1$. Note that $w=0$ yields Naive Padding, and $w=1$ yields Centered Padding. Optimal performance in the present experiment was achieved with $w=0.25$, in which case $N=4000$ random restarts yielded an induced subgraph in the background that was isomorphic to the template network.

3.1 M-GMMF on Semantic Property Graphs

For our second example, we consider the semantic property graph released by Pacific Northwest National Laboratories as part of the MAA-AIDA Data Release V2.1.2 via the DARPA MAA program [15]. In this dataset, the background network is a knowledge graph constructed from a variety of documents (e.g., newspaper articles) by DARPA’s AIDA program. At a high level, the graph is encoding the real-world relationships between a variety of entities (people, locations, major events, etc.) that can be automatically extracted from a variety of data sources. Practically, the graph is a richly featured network on the order of 100K nodes. Node properties include name; rdf:type (corresponding to a structured ontology of types); textValue; linkTarget; start time; among others. Edge properties include name/id, rdf:type, argument (values given to edges of a given rdf:type), among others. Note that many nodes and edges do not have values for all properties. The templates here, themselves richly featured knowledge graphs, are of the order of 10s of vertices (ranging in size from 33 nodes/40 edges to 11 nodes/11 edges); for each of three template types, there are 6 variants with varying error levels, including one variant (version “A”) that is perfectly/isomorphically embedded into the background.

The principal challenge in applying our M-GMMF methodology on such richly featured data is sensibly incorporating the rich, structured features into our multiplex network framework. Towards this end, we adopted the following approach. We incorporated the vertex features/properties into a penalty term in the objective function, encoding the features into a vertex–to–vertex similarity matrix $S$ (this is possible provided that similarities are easily computed within each vertex covariate, which is the case here). Edge features were used to divide the knowledge graph into multiple overlapping channels in a multiplex network. One channel was assigned to each unique (E(rdf:type), E(argument)) pair in the template, and we divided background edges amongst the channels via

i. A hard split based on E(argument): Within each (E(rdf:type), E(argument)) channel, only edges with matching E(argument) are potentially present. If the E(argument) property is missing in the background, we allow the edge to possibly exist in all (E(rdf:type), E(argument)) channels.

ii. A soft split (weighted according to E(rdf:type) similarity; note that a rdf:type similarity function was provided with the data) based on E(rdf:type): Within each (E(rdf:type), E(argument)) channel, each background edge with matching (or missing) E(argument) is present, and the edge is weighted according to a similarity measured between its E(rdf:type) and that of the channel. Scalability gains can be achieved by thresholding the similarities to improve sparsity.
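The two-stage split described in items i. and ii. can be sketched as below. This is an illustration under stated assumptions rather than the authors' implementation: edges are represented as dictionaries with hypothetical keys `src`, `dst`, `rdf_type`, and `argument` (with `None` marking a missing argument), the type names in the example are invented, and `toy_similarity` is a stand-in for the rdf:type similarity function that was provided with the data.

```python
def build_channels(template_edges, background_edges, type_similarity, threshold=0.0):
    """Assign weighted background edges to (rdf_type, argument) channels."""
    # One channel per unique (rdf_type, argument) pair appearing in the template.
    keys = list(dict.fromkeys((e["rdf_type"], e["argument"]) for e in template_edges))
    channels = {key: [] for key in keys}
    for edge in background_edges:
        for ch_type, ch_arg in keys:
            # Hard split: a present argument must match the channel's argument;
            # a missing argument lets the edge appear in every channel.
            if edge["argument"] is not None and edge["argument"] != ch_arg:
                continue
            # Soft split: weight by similarity between the edge type and channel type.
            weight = type_similarity(edge["rdf_type"], ch_type)
            # Thresholding drops weak edges, which keeps the channels sparse.
            if weight > threshold:
                channels[(ch_type, ch_arg)].append((edge["src"], edge["dst"], weight))
    return channels


def toy_similarity(type_a, type_b):
    """Hypothetical similarity: 1 if equal, 0.5 if same top-level type, else 0."""
    if type_a == type_b:
        return 1.0
    return 0.5 if type_a.split(".")[0] == type_b.split(".")[0] else 0.0


template = [
    {"src": 0, "dst": 1, "rdf_type": "Conflict.Attack", "argument": "Attacker"},
    {"src": 1, "dst": 2, "rdf_type": "Movement.Transport", "argument": "Destination"},
]
background = [
    {"src": 10, "dst": 11, "rdf_type": "Conflict.Attack", "argument": "Attacker"},
    {"src": 11, "dst": 12, "rdf_type": "Conflict.Demonstrate", "argument": None},
    {"src": 12, "dst": 13, "rdf_type": "Movement.Transport", "argument": "Origin"},
]
channels = build_channels(template, background, toy_similarity)
```

In this toy example the edge with a missing argument is considered for both channels but survives only in the one whose type it resembles, and the edge with a mismatched argument is excluded everywhere.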

Spatiotemporal constraints can be coded into separate channels in the multiplex graph, one channel per spatiotemporal constraint in the template. Each constraint (e.g., action A must occur between x and y days after action B) yields an edge filter, with only edges that could potentially satisfy the constraint being added to that constraint channel. Lastly, numeric edge features are used to further weight the edges in the background/template. The final objective function we used in our M-GMMF approach was then of the form

[TABLE]

The performance of M-GMMF and various other approaches (as scored by the GED scoring metric of [15]) is presented in Figure 8. (Note that [43], which presents the filtering approach of Tu et al., also presents a method that uses the clean “A” template for training and achieves essentially the best score for all template versions; we did not include those scores for comparison in our figure.)

Overall, 18 templates were present: 3 distinct template types, each with 6 variants encompassing different amounts of template noise. See Figure 9 for a pair of example templates and the corresponding M-GMMF recovered signals, which give an indication of the knowledge graph structure encoded into the multiplex graph. One template of each type was constructed to have a perfect isomorphic match in the background, while the noisy templates were designed for inexact/fuzzy matching. All approaches identified the isomorphic match in the background for version “A” of all three template types, while our approach achieved its best relative results on the larger, more complex template (Template 1). For Template 1, we obtained the best (or nearly the best) score on 3/6 versions; see Table 2 for details.

Templates 2 and 3 were essentially tree-like structures (nodes/edges is 13/15 and 11/11 respectively), and we suspect that the filtering-based approaches are more suitable to this problem type. Indeed, M-GMMF is designed for larger/more complex templates, though our performance (especially compared to the non-filtering G-Finder of [28]) is encouraging on these instantiations, especially on version “B” of the templates.

Note also that our approach does not directly seek to optimize the GED of [15] and here does not make use of the importance weights provided by the GED scoring metric (though these could easily be encoded via edge weights and scaling the similarities in $S$); rather, we seek to optimize the multiplex GM objective function of Eq. (18). Nonetheless, as shown in Figure 10, the rankings of the random restart outputs for our GM objective function and for the GED are often highly correlated, with recovered signals scoring well in one metric often scoring well in both. The interesting gap appearing in the plot of Template 1E is evocative of the GM phase transitions appearing in the literature (see, for example, [17]) and bears further study.
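One way to quantify the agreement between the two rankings is a rank correlation computed across restarts. The sketch below implements Spearman's rank correlation in plain Python; the choice of Spearman's coefficient and all function names are assumptions made for illustration, since the text does not state which measure underlies Figure 10.

```python
import math


def ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def spearman(gm_objective_values, ged_scores):
    """Spearman correlation: Pearson correlation of the two rank vectors."""
    return pearson(ranks(gm_objective_values), ranks(ged_scores))
```

Here each restart would contribute one GM objective value and one GED score; a coefficient near $1$ indicates that restarts ranked well by one metric are ranked well by the other.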

4 Discussion

In this paper, we presented a framework for finding noisy copies of a multiplex template in a large multiplex background network. Our strategy, which extends [41] to the multiplex setting, uses graph matching combined with multiple random restarts to search the background for locally optimal matches to the template. To formalize this strategy, we provided a very natural extension of the classical graph matching problem to the multiplex setting that is easily amended to matching graphs of different orders (both across networks and channels). Further, the effectiveness of the resulting algorithm, named M-GMMF, is demonstrated both theoretically and empirically.

There are a number of extensions and open questions that arose during the course of this work. Natural theoretic extensions include lifting Theorems 1 and 2 to non-edge independent models (note that certain localized dependencies amongst edges can easily be handled in the McDiarmid proof framework, while globally dependent errors provide a more significant challenge); formulating the analogues of Theorems 1 and 2 in the weighted, attributed graph settings; and considering the theoretic properties of various continuous relaxations of the multiplex GM problem akin to [1, 29, 6].

A key methodological question in multiplex graph matching was touched upon in Remark 1; indeed, we expect the question of how to weight the matching across channels to be essential when applying these methods to topologically diverse and weighted networks. If the order of magnitude of the edge weights varies across channels, then it is easy to see a GM algorithm aligning channels with large edge weights at the expense of the alignment accuracy in other channels. Judiciously choosing $(\lambda_{i})$ would allow for the signal in channels with smaller edge weights to be better leveraged towards a better overall matching.
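The domination effect described here is easy to see numerically. The sketch below evaluates a channel-weighted disagreement $\sum_i \lambda_i \sum_{u,v}\big(A_i(u,v)-B_i(\sigma(u),\sigma(v))\big)^2$ for a fixed alignment $\sigma$, assuming that the weights $(\lambda_i)$ enter multiplicatively, one per channel; the function names and the toy matrices are hypothetical, and the rescaling used in the example is only one possible heuristic, not a recommendation from the text.

```python
def channel_disagreement(A, B, perm):
    """Squared Frobenius disagreement between A and B restricted to perm.

    perm[u] is the background vertex matched to template vertex u.
    """
    m = len(A)
    return sum((A[u][v] - B[perm[u]][perm[v]]) ** 2
               for u in range(m) for v in range(m))


def weighted_multiplex_objective(As, Bs, perm, lambdas):
    """Sum of per-channel disagreements, each scaled by its weight lambda_i."""
    if not (len(As) == len(Bs) == len(lambdas)):
        raise ValueError("need one background matrix and one weight per channel")
    return sum(lam * channel_disagreement(A, B, perm)
               for A, B, lam in zip(As, Bs, lambdas))


# Channel 1 has 0/1 weights; channel 2 has weights on the order of 100.
A1, B1 = [[0, 1], [1, 0]], [[0, 0], [0, 0]]
A2, B2 = [[0, 100], [100, 0]], [[0, 0], [0, 0]]
identity = [0, 1]

unweighted = weighted_multiplex_objective([A1, A2], [B1, B2], identity, [1.0, 1.0])
rescaled = weighted_multiplex_objective([A1, A2], [B1, B2], identity, [1.0, 1e-4])
```

With equal weights the second channel accounts for essentially all of `unweighted`; after rescaling by the inverse squared magnitude of its edge weights, both channels contribute equally to `rescaled`.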

While the largest network we consider in this work has $\approx 100{,}000$ vertices, scaling this approach to very large networks is essential. By utilizing efficient data structures for sparse, low-rank matrices and a clever implementation of the LAP subroutine of M-FAQ (step iii. in Algorithm 2), we are able to match $O(10)$-vertex templates to $20K$-vertex background graphs in $<10$s per restart with our base M-GMMF code (available in iGraphMatch) implemented in R on a standard laptop. Further work to scale M-GMMF by leveraging both efficient data structures and scalable approximate LAP solvers is currently underway.

List of abbreviations

GM(P): Graph Matching (Problem)

nMGMP: Naive Multiplex Graph Matching Problem

cMGMP: Centered Multiplex Graph Matching Problem

M-GMMF: Multiplex Graph Matching Matched Filters

M-FAQ: Multiplex Fast Approximate Quadratic

ME: (Single Channel Source) Error Multiplex

MS: (Single Channel Errors) Source Multiplex

GED: Graph Edit Distance

LAP: Linear Assignment Program

DARPA: Defense Advanced Research Projects Agency

MAA: Modeling Adversarial Activity

AIDA: Active Interpretation of Disparate Alternatives

Declarations

Availability of data and materials

The background graph for the 3-channel social network in Section 3 is available at http://multilayer.it.uu.se/datasets.html. The data from the DARPA MAA analysis in Section 3.1 are not publicly available, and the results obtained for the outside algorithms appear in the cited papers in the literature.

The code implementing the M-GMMF and M-FAQ procedures can be downloaded as part of our R package, iGraphMatch, which is available on CRAN or can be downloaded at https://github.com/dpmcsuss/iGraphMatch.

Competing interests

Not applicable.

Funding

Dr. Sussman’s contribution to this work was partially supported by a grant from MIT Lincoln Labs and the Department of Defense. This material is based on research sponsored by the Air Force Research Laboratory and DARPA under agreement numbers FA8750-18-2-0035 and FA8750-20-2-1001. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright notation thereon. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the Air Force Research Laboratory and DARPA or the U.S. Government.

Authors’ contributions

Daniel L. Sussman, Carey E. Priebe, Vince Lyzinski, and Konstantinos Pantazis conceived of the method and developed the theory. Konstantinos Pantazis, Youngser Park, Daniel L. Sussman, Vince Lyzinski, and Zhirui Li worked on writing and implementing the algorithm and performing relevant experiments. Konstantinos Pantazis, Daniel L. Sussman, and Vince Lyzinski wrote and curated the manuscript.

Acknowledgments

Not applicable.

Appendix A Appendix

Herein we collect details of our auxiliary algorithms and proofs of our main results.

A.1 Multiplex FAQ

The details of the M-FAQ algorithm are presented below.

A.2 Proof of Eq. (6)

For each $P\in\Pi_{n}$ define

[TABLE]

Assuming that $P\in\Pi_{n,m,k}$ , then $|\Delta_{P}|\leq mk$ . Note that $X_{P}$ is a function of (at most) $3c|\Delta_{P}|$ independent Bernoulli random variables, and changing any one of these Bernoulli random variables can change the value of $X_{P}$ by at most $8$ . McDiarmid’s inequality [32] then implies that for any $t\geq 0$ ,

[TABLE]

Note that if $\widehat{C}_{i}(j,k),\widehat{D}_{i}(j,k),\widehat{D}_{i}(\sigma_{p}(j),\sigma_{p}(k))\in\{1,-1\}$ then

[TABLE]

Define

[TABLE]

so that $|\Delta_{P}^{(i,0)}|=|\Delta_{P}^{(i,1)}|+|\Delta_{P}^{(i,2)}|+|\Delta_{P}^{(i,3)}|$ . We then have

[TABLE]

Note that if $P,Q\in\Pi_{n,m,k}$ , then $X_{P}=X_{Q}$ if $\sigma_{p}(j)=\sigma_{q}(j)$ for all $j\in[m]$ ; i.e., if there exists a $U\in\mathcal{P}_{m,n}$ such that $PU=Q$ . Note that this defines an equivalence relation on $P,Q\in\Pi_{n,m,k}$ which we will denote by “ $\sim$ ,” and let $\Pi_{n,m,k}^{*}$ be a fixed (but arbitrarily chosen) set composed of one member of each equivalence class according to “ $\sim$ .” Note that $|\Pi_{n,m,k}^{*}|$ is at most $m^{2k}n^{2k}$ . Letting $t=\mathbb{E}(X_{P})$ in Eq. (20), we have that if $n>\mathfrak{n}=\max(n_{0},n_{1})$

[TABLE]

as desired.

A.3 Proof of Eq. (7)

We have that if each $W_{i}\sim\mathrm{ER}(n,p_{i})$, then

[TABLE]

so that $\mathbb{E}(\Delta_{P}^{(i,1)})=2p_{i}(1-p_{i})|\Delta_{P}|$. Also, $\Delta_{P}^{(i,1)}$ is then a function of at most $2|\Delta_{P}|$ independent Bernoulli random variables, and changing the value of any one of these can change the value of $\Delta_{P}^{(i,1)}$ by at most $2$. McDiarmid’s inequality then yields the desired result

[TABLE]

by setting $t=p_{i}(1-p_{i})|\Delta_{P}|.$
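As a check on the order of the exponent, the generic bounded-differences form of McDiarmid's inequality can be instantiated with the quantities stated in this proof. The display below is a sketch that uses only those quantities (at most $N\leq 2|\Delta_{P}|$ independent variables, each with bounded difference $c_{\ell}=2$, and $t=p_{i}(1-p_{i})|\Delta_{P}|$); the symbols $f$, $N$, and $c_{\ell}$ are generic notation introduced here, and the exact statement and constants of Eq. (7) should be taken from the main text.

$$
\mathbb{P}\big(|f-\mathbb{E}f|\geq t\big)\;\leq\;2\exp\!\left(-\frac{2t^{2}}{\sum_{\ell=1}^{N}c_{\ell}^{2}}\right)\;\leq\;2\exp\!\left(-\frac{2\,p_{i}^{2}(1-p_{i})^{2}|\Delta_{P}|^{2}}{8|\Delta_{P}|}\right)\;=\;2\exp\!\left(-\frac{p_{i}^{2}(1-p_{i})^{2}|\Delta_{P}|}{4}\right).
$$

The second inequality uses $\sum_{\ell}c_{\ell}^{2}\leq 4\cdot 2|\Delta_{P}|=8|\Delta_{P}|$, so the deviation probability decays exponentially in $|\Delta_{P}|$ for fixed $p_{i}$.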

A.4 Proof details for Eq. (9)

Let the equivalence relation “ $\sim$ ” on $\Pi_{n,m,k}$ be defined via $P\sim Q$ if there exists a $U\in\mathcal{P}_{n,m}$ such that $PU=Q$ . Note that if $P\sim Q$ then

[TABLE]

Let $\Pi_{n,m,k}^{*}$ be a fixed (but arbitrarily chosen) set composed of one member of each equivalence class according to “ $\sim$ ,” and note that $|\Pi_{n,m,k}^{*}|$ is at most $m^{2k}n^{2k}$ . Given the assumptions in Section 2.2, for $n>n_{2}$ we have that for each $P\in\Pi_{n,m,k}^{*}$ ,

[TABLE]

Denote the event bound in Eq. (A.4) via $\mathcal{E}_{n,P}$ .

For $n>\max(n_{2},n_{0})$ , we then have

[TABLE]

as desired.

A.5 Proof details for Eq. (13)

For $P\in\Pi_{n}$ , define

[TABLE]

For $P\in\Pi_{n,m,k}$ , we then have that $X_{P}$ defined in Eq. (19) satisfies

[TABLE]

Bibliography

[1] Y. Aflalo, A. Bronstein, and R. Kimmel. On convex relaxation of graph isomorphism. Proceedings of the National Academy of Sciences, 112(10):2942–2947, 2015.
[2] N. Alon, P. Dao, I. Hajirasouliha, F. Hormozdiari, and S. C. Sahinalp. Biomolecular network motif counting and discovery by color coding. Bioinformatics, 24(13):i241–i249, 2008.
[3] N. Alon, R. Yuster, and U. Zwick. Color-coding. Journal of the ACM (JACM), 42(4):844–856, 1995.
[4] J. Arroyo, D. L. Sussman, C. E. Priebe, and V. Lyzinski. Maximum likelihood estimation and graph matching in errorfully observed networks. arXiv preprint arXiv:1812.10519, 2018.
[5] B. Barak, C. Chou, Z. Lei, T. Schramm, and Y. Sheng. (Nearly) efficient algorithms for the graph matching problem on correlated random graphs. arXiv preprint arXiv:1805.02349, 2018.
[6] J. Bento and S. Ioannidis. A family of tractable graph distances. arXiv preprint arXiv:1801.04301, 2018.
[7] S. Boccaletti, G. Bianconi, R. Criado, C. I. Del Genio, J. Gómez-Gardenes, M. Romance, I. Sendina-Nadal, Z. Wang, and M. Zanin. The structure and dynamics of multilayer networks. Physics Reports, 544(1):1–122, 2014.
[8] L. Chen, J. T. Vogelstein, V. Lyzinski, and C. E. Priebe. A joint graph inference case study: The C. elegans chemical and electrical connectomes. Worm, 5(2), 2016.
