Computational Approaches for Zero Forcing and Related Problems

Boris Brimkov; Caleb C. Fast; Illya V. Hicks

arXiv:1704.02065·cs.DM·September 20, 2018

TL;DR

This paper introduces computational methods that combine integer programming and combinatorial algorithms for zero forcing and its variants; they are competitive with the state-of-the-art algorithm for zero forcing and outperform brute force computation on the variants.

Contribution

It presents the first general-purpose algorithms for connected zero forcing and controlling forcing timesteps, outperforming brute force methods.

Findings

01 Algorithms are competitive with the state-of-the-art zero forcing algorithm.

02 Proposed methods outperform brute force approaches.

03 Zero forcing is formulated both as a dynamic process and as a set-covering problem.

Abstract

In this paper, we propose computational approaches for the zero forcing problem, the connected zero forcing problem, and the problem of forcing a graph within a specified number of timesteps. Our approaches are based on a combination of integer programming models and combinatorial algorithms, and include formulations for zero forcing as a dynamic process, and as a set-covering problem. We explore several solution strategies for these models, test them on various types of graphs, and show that they are competitive with the state-of-the-art algorithm for zero forcing. Our proposed algorithms for connected zero forcing and for controlling the number of zero forcing timesteps are the first general-purpose computational methods for these problems, and are superior to brute force computation.

Equations

$|Z^{*}\cap(S\cup N[u])|=|S|+|N[u]\backslash(cl(S)\cup\{w\})|=|S|+|N[u]\backslash cl(S)|-1$

$|Z^{*}\cap(S\cup N[u])|>|S|$

$|(Z^{*}\backslash cl(S\cup N[u]))\cup Q|$

$|\mathcal{C}|\leq\sum_{i=1}^{s}\binom{n}{i}$

$s_{v}+\sum_{e\in\delta^{-}(v)}y_{e}=1\qquad\forall v\in V$

$\sum_{v\in B}s_{v}\geq 1\qquad\forall B\in\mathcal{B}$

$\sum_{v\in V}x_{v}\geq 1$

$\nexists\,A\subset B$ s.t. $A$ is a fort

$\forall v\notin B,\ \exists w\in B$ s.t. if $A\subset B\cup\{v\}$ is a fort and $v\in A$, then $w\in A$

$\sum_{v\in V\backslash F}x_{v}=1$

$\sum_{v\in B}\big(s_{v}+\sum_{v\in cl(N[w])}z_{w}\big)\geq 1\qquad\forall B\in\mathcal{B}$

Full text

Computational Approaches for Zero Forcing and Related Problems

Boris Brimkov

Caleb C. Fast

Department of Computational and Applied Mathematics, Rice University, 6100 Main St. - MS-134, Houston, Texas 77005

Illya V. Hicks

Keywords: Combinatorial optimization, zero forcing, integer programming, set-covering

Journal: European Journal of Operational Research

1 Introduction

Zero forcing is an iterative graph coloring process where at each discrete timestep, a colored vertex with a single uncolored neighbor forces that neighbor to become colored. A zero forcing set of a graph is a set of initially colored vertices which forces the entire graph to become colored. The zero forcing number is the cardinality of the smallest zero forcing set. Zero forcing was initially introduced to bound the maximum nullity of the family of symmetric matrices described by a graph [4]; it was also independently studied in quantum physics [20] and theoretical computer science [60], and has since found a variety of uses in physics, logic circuits, coding theory, power network monitoring, and in modeling the spread of diseases and information in social networks; see [10, 20, 22, 42, 43, 55, 61] and the bibliographies therein.

Connected zero forcing is a variant of zero forcing in which the initially colored set of vertices induces a connected subgraph. The connected zero forcing number of a graph is the cardinality of the smallest connected set of initially colored vertices which forces the entire graph to be colored (i.e., the smallest connected zero forcing set). Applications and various structural and computational aspects of connected zero forcing have been investigated in [15, 16, 17]; in particular, it can be used for modeling the spread of ideas or diseases originating from a single connected source in a network, or for power network monitoring accounting for the cost of supporting infrastructure. Other variants of zero forcing, such as positive semidefinite zero forcing [11, 33, 44, 57], fractional zero forcing, signed zero forcing [41], and $k$ -forcing [5, 48] have also been studied. These are typically obtained by modifying the zero forcing color change rule, or adding certain restrictions to a zero forcing set. The number of timesteps in the zero forcing process after which a graph becomes colored is also a problem of interest (see, e.g., [14, 21, 28, 43, 57]). Connected variants of other graph problems – such as connected domination and connected power domination [25, 31, 36, 40] – have been extensively studied as well.

A closely related problem to zero forcing is power domination, where given a set $S$ of initially colored vertices, the zero forcing color change rule is applied to $N[S]$ instead of to $S$ . Integer programming formulations for power domination and its variants have been explored in [1, 18]. The power domination problem is derived from the phase measurement unit (PMU) placement problem in electrical engineering, which has also been studied extensively; see, e.g., [47, 49] and the bibliographies therein for various integer programming models and combinatorial algorithms for the PMU placement problem. Another closely related problem to zero forcing is the target set selection problem, where given a set $S$ of initially colored vertices and a threshold function $\theta:V(G)\rightarrow\mathbb{Z}$ , all uncolored vertices $v$ that have at least $\theta(v)$ colored neighbors become colored. Thus, the zero forcing problem constrains the infectors, but the target set selection problem constrains the infectees. See [3, 13, 27] for computational approaches of finding the smallest target set $S$ of initially colored vertices which causes the entire graph to become colored.

Computing the zero forcing number and connected zero forcing number of a graph are both NP-complete problems [2, 15, 60]; nevertheless, it is important to develop practical algorithms for solving these problems, at least on moderately-sized graphs. The state-of-the-art approach for computing the zero forcing number of a graph is a combinatorial algorithm called Wavefront, developed by Butler et al. [23] (a version of this algorithm, altered for a related problem, appears in Butler et al. [24]). While this algorithm is the best available for the zero forcing problem, it is not flexible and cannot accommodate additional constraints, such as assuring connectivity of the solution or limiting the number of timesteps used to force the graph. A lot of effort has been put into developing closed formulas, efficient algorithms, characterizations, and bounds for the zero forcing numbers of graphs with special structure (see, e.g., [4, 12, 15, 16, 32, 34, 45, 50]), but relatively little progress has been made on developing computational methods for general graphs.

1.1 Main Contributions

In this paper, we explore approaches for computing the zero forcing number of a graph using integer programming. In particular, we present formulations of zero forcing based on two different perspectives – one as a dynamic process, and the other as a set-covering problem. We explore several solution strategies of these models – such as direct computation, constraint generation, and generation of facet-inducing constraints – and compare their performance to Wavefront on different types of graphs.

We also propose a combinatorial algorithm for computing the connected zero forcing number of a graph, and extend the proposed integer programming models to the connected zero forcing problem by adding connectivity constraints. In doing so, we explore several different types of connectivity constraints which have been used in problems like connected domination, Steiner trees, and forest planning. Until now, there have not been any computational approaches for connected zero forcing of general graphs other than brute-force computation.

Finally, we adapt one of our integer programs to find zero forcing sets which force the graph within a specified number of timesteps, and have minimum cardinality among all such sets. To our knowledge, there have not been any previously-implemented algorithms for this problem (though some models have been proposed for the related problem of power domination [1]).

Our computational experiments show that our integer programming models are generally comparable to the Wavefront algorithm in sparse random graphs, and are superior to Wavefront in graphs corresponding to electrical power grids and other standard benchmark networks. Our proposed integer programming models for connected zero forcing significantly outperformed the combinatorial brute force and branch-and-bound algorithms. Moreover, in some cases, these approaches were faster, and able to handle larger graphs, than the Wavefront algorithm and the zero forcing analogues of the integer programs. This is somewhat surprising, since the connected variants of problems like domination and power domination have typically proven more difficult to solve computationally, due to their non-locality (see [40] for more details). Since the connected zero forcing number is an upper bound to the zero forcing number, the proposed approaches for connected zero forcing can be used to obtain upper bounds or approximations to the zero forcing number, especially for graphs which are too large for Wavefront.

The paper is organized as follows. In the next section, we recall some graph theoretic notions, specifically those related to zero forcing. In Section 3, we present combinatorial approaches for computing the zero forcing number and connected zero forcing number of a graph; in Section 4, we present integer programming approaches for these problems. In Section 5, we describe the implementation of our proposed approaches, and compare them through computational experiments on various types of graphs. We conclude with some final remarks and open questions in Section 6.

2 Preliminaries

A graph $G=(V,E)$ consists of a vertex set $V$ and an edge set $E$ of two-element subsets of $V$ . In this paper, we consider simple graphs, for which a subset $\{v,w\}\in E$ must have $v\neq w$ and $E$ contains at most one copy of $\{v,w\}$ . The order and size of $G$ are denoted by $n=|V|$ and $m=|E|$ , respectively. Two vertices $v,w\in V$ are adjacent, or neighbors, if $\{v,w\}\in E$ . The neighborhood of $v\in V$ is the set of all vertices which are adjacent to $v$ , denoted $N(v)$ ; the closed neighborhood of $v$ , denoted $N[v]$ , is the set $N(v)\cup\{v\}$ . Similarly, given $S\subset V$ , $N(S)$ denotes the set $(\bigcup_{v\in S}N(v))\backslash S$ , and $N[S]$ denotes the set $N(S)\cup S$ . The degree of $v\in V$ is defined as $d(v)=|N(v)|$ . Given $S\subset V$ , the induced subgraph $G[S]$ is the subgraph of $G$ whose vertex set is $S$ and whose edge set consists of all edges of $G$ which have both endpoints in $S$ . For other graph theoretic terminology and definitions, we refer the reader to [59].

Given a graph $G=(V,E)$ and a set $S\subset V$ of initially colored vertices, the color change rule dictates that at each integer-valued timestep, a colored vertex $u$ with a single uncolored neighbor $v$ forces that neighbor to become colored. The closure of $S$ , denoted $\emph{cl}(S)$ , is the set of colored vertices obtained after the color change rule is applied until no new vertex can be forced; it can be shown that the closure of $S$ is uniquely determined by $S$ (see [4]). A zero forcing set is a set whose closure is all of $V$ ; the zero forcing number of $G$ , denoted $Z(G)$ , is the minimum cardinality of a zero forcing set. A chronological list of forces associated with a zero forcing set $Z$ is a sequence of forces applied to obtain the closure of $Z$ in the order they are applied. A forcing chain for a chronological list of forces is a maximal sequence of vertices $(v_{1},\ldots,v_{k})$ such that $v_{i}$ forces $v_{i+1}$ for $1\leq i\leq k-1$ . Each forcing chain is a distinct path in $G$ , one of whose endpoints is an initially colored vertex; the other is called a terminal vertex. See Figure 1 for an illustration. A fort, defined by Fast and Hicks [37], is a non-empty set $F\subset V$ such that no vertex outside $F$ is adjacent to exactly one vertex in $F$ . In Figure 1, the sets $\{1,4,5,7\}$ and $\{2,3,6\}$ are forts. A zero forcing set restrained by $S\subset V$ is a zero forcing set which contains $S$ ; $Z(G;S)$ denotes the cardinality of the smallest zero forcing set restrained by $S$ (cf. [17]).
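The fort definition above translates directly into a short membership test. The following is a minimal sketch, assuming the graph is stored as a dictionary mapping each vertex to the set of its neighbors; the function name `is_fort` and this representation are illustrative choices, not part of the paper.

```python
def is_fort(adj, F):
    """Return True if F is a fort of the graph given by adj.

    adj: dict mapping each vertex to the set of its neighbors (assumed format).
    F:   iterable of vertices.
    A fort is a non-empty set F such that no vertex outside F is adjacent
    to exactly one vertex of F.
    """
    F = set(F)
    if not F:
        return False
    for v in adj:
        if v not in F and len(adj[v] & F) == 1:
            return False
    return True


# Path on three vertices: 0 - 1 - 2
path = {0: {1}, 1: {0, 2}, 2: {1}}
fort_example = is_fort(path, {0, 2})   # vertex 1 has two neighbors in {0, 2}
non_fort_example = is_fort(path, {0})  # vertex 1 has exactly one neighbor in {0}
```

On the path 0 - 1 - 2, the set {0, 2} passes the test because the only outside vertex, 1, has two neighbors in the set, while {0} fails because vertex 1 has exactly one.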

A connected zero forcing set of $G$ is a zero forcing set of $G$ which induces a connected subgraph. The connected zero forcing number of $G$ , denoted $Z_{c}(G)$ , is the cardinality of a minimum connected zero forcing set of $G$ . For short, we may refer to these as connected forcing set and connected forcing number. Note that a disconnected graph cannot have a connected forcing set.

An empty graph is a graph with no edges. A graph is cubic if all of its vertices have degree 3. Let $C(n,k)$ be the graph with vertex set $\{0,\ldots,n-1\}$ and edge set $\{\{i,j\}:0\leq i,j\leq n-1,|i-j|\leq k/2\}$. A Watts-Strogatz graph with parameters $n$, $k$, and $\beta$ refers to a graph obtained from $C(n,k)$ by replacing each edge of $C(n,k)$ with probability $\beta$ by a randomly chosen edge. When $n$ is clear from the context (or for a predetermined set of values of $n$), we will refer to the Watts-Strogatz graphs with parameters $n$, $k$, and $\beta$ as WS$(k,\beta)$. Watts-Strogatz graphs were introduced in [58], and are a popular random graph model; they are meant to have small-world properties such as short average path lengths and high clustering. Finally, we will use the notation $[n]$ to represent the set $\{1,\ldots,n\}$.

3 Combinatorial Approaches

In this section, we describe several combinatorial approaches for computing the zero forcing and connected forcing numbers of a graph $G=(V,E)$ . The trivial, or brute force, approach for finding a minimum zero forcing set of $G$ is to iteratively compute the closures of all subsets of $V$ of size $i$ , starting from $i=1$ and incrementing $i$ , until a zero forcing set is found. Similarly, to find a minimum connected forcing set, one could again generate subsets of vertices of increasing size, check whether each set induces a connected subgraph, and stop when the first connected set whose closure is $V$ is found. The closure of a set of vertices can be found in $O(m+n)$ time using Algorithm 1.

Proposition 1

Let $G=(V,E)$ be a graph and $Z\subset V$ . Algorithm 1 finds $cl(Z)$ in $O(m+n)$ time.

Proof 1

Algorithm 1 maintains an array “ $\operatorname{colored}$ ” which indicates whether a vertex is colored, an array “ $\operatorname{count}$ ” which counts the number of colored neighbors a vertex has, and a Stack containing active vertices, i.e., colored vertices which have a single uncolored neighbor. After the first two for-loops, the Stack contains all active vertices. In each iteration of the while-loop, an active vertex $u$ forces its uncolored neighbor $v$ and is removed from the Stack. The only inactive vertices which may become active as a result of $v$ becoming colored are the vertices in $N[v]$ ; all of these vertices are checked in the while-loop, and any active vertices among them are added to the Stack. Since one vertex is removed from the Stack (and can never re-enter the Stack) in each iteration of the while-loop, Algorithm 1 terminates. When the Stack is empty, there are no more active vertices; thus, no more forces are possible, and the set of colored vertices at the end of the while-loop is exactly $cl(Z)$ , as desired.

Lines 1–6 can be executed in $O(n+m)$ time since the neighborhood of each vertex is considered at most once. For each vertex $x$ that enters and exits the Stack, $O(d(x))$ operations are performed at most twice: once when $x$ is the unique uncolored neighbor of some other vertex (i.e., when $x=v$ in lines 11–16), and the second time when searching for the unique uncolored neighbor of $x$ (i.e., when $x=u$ in line 9). Thus, since each vertex enters and exits the Stack at most once, the total runtime of Algorithm 1 is $O(m+n)+O(\sum_{x\in V}d(x))=O(n+m)$. $\Box$
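The listing of Algorithm 1 is not reproduced in this text, so the following Python sketch only follows the description in the proof: a `colored` array, a `count` array of colored neighbors, and a stack of active vertices. The dictionary-of-sets graph representation and the function name `closure` are assumptions made for illustration, and the line numbers cited in the proof do not correspond to lines of this sketch.

```python
def closure(adj, Z):
    """Return cl(Z) for a graph given as {vertex: set_of_neighbors} (assumed format)."""
    colored = {v: False for v in adj}
    for v in Z:
        colored[v] = True
    # count[v] = number of colored neighbors of v
    count = {v: sum(1 for w in adj[v] if colored[w]) for v in adj}

    def is_active(v):
        # Active: colored, with exactly one uncolored neighbor.
        return colored[v] and len(adj[v]) - count[v] == 1

    stack = [v for v in adj if is_active(v)]
    while stack:
        u = stack.pop()
        # The neighbor u was waiting to force may already have been colored.
        target = next((w for w in adj[u] if not colored[w]), None)
        if target is None:
            continue
        colored[target] = True  # u forces target
        # Only vertices in N[target] can become active as a result.
        for x in adj[target]:
            count[x] += 1
            if is_active(x):
                stack.append(x)
        if is_active(target):
            stack.append(target)
    return {v for v in adj if colored[v]}


# Star with center 0 and leaves 1, 2, 3
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
one_leaf = closure(star, {1})      # leaf 1 forces the center, then the process stalls
two_leaves = closure(star, {1, 2}) # the center is forced and then forces leaf 3
```

Each vertex is pushed at most once, because the number of uncolored neighbors of a vertex only decreases, so it can equal one while the vertex is colored at only one moment of change.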

The brute force approach works well when the graph is known a priori to have a very small or very large forcing number (in the latter case, one would start from $i=n$ , decrement $i$ as soon as a forcing set is found, and stop when all sets of vertices of a certain size are not forcing). Similarly, the brute force approach can be used in conjunction with theoretical bounds on the forcing number in terms of other efficiently-computable parameters (see, e.g., [4, 12, 15, 16, 54]). In particular, if it is determined that $k_{1}\leq Z_{c}(G)\leq k_{2}<\frac{n}{2}$ , it can be checked whether each of the $\binom{n}{k_{1}}+\cdots+\binom{n}{k_{2}}$ sets of vertices of appropriate size is connected and forcing in $O(m+n)$ time, so $Z_{c}(G)$ can be computed in $O((k_{2}-k_{1})n^{2+k_{2}})$ time (the same applies to $Z(G)$ ). Other advantages of the brute force approach are that it is easy to implement, uses little memory, and can be easily parallelized, since closures of different sets of vertices can be computed independently.
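The brute force search described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the graph is a dictionary of neighbor sets, the names `brute_force_forcing_number` and `is_connected` are hypothetical, and the closure is computed with a simple fixed-point loop instead of the linear-time routine, to keep the example short.

```python
from itertools import combinations


def closure(adj, Z):
    """Simple fixed-point closure (not the linear-time routine)."""
    colored = set(Z)
    changed = True
    while changed:
        changed = False
        for u in list(colored):
            uncolored = adj[u] - colored
            if len(uncolored) == 1:
                colored |= uncolored
                changed = True
    return colored


def is_connected(adj, S):
    """Check whether S induces a connected subgraph (empty set counts as not connected)."""
    S = set(S)
    if not S:
        return False
    start = next(iter(S))
    seen, frontier = {start}, [start]
    while frontier:
        u = frontier.pop()
        for w in adj[u] & S:
            if w not in seen:
                seen.add(w)
                frontier.append(w)
    return seen == S


def brute_force_forcing_number(adj, connected=False):
    """Try subsets of increasing size i = 1, 2, ... until one forces the graph."""
    vertices = list(adj)
    for size in range(1, len(vertices) + 1):
        for subset in combinations(vertices, size):
            if connected and not is_connected(adj, subset):
                continue
            if closure(adj, subset) == set(vertices):
                return size
    return None  # e.g. a disconnected graph has no connected forcing set


star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
z_star = brute_force_forcing_number(star)
zc_star = brute_force_forcing_number(star, connected=True)
```

On the star with three leaves, two leaves form a minimum zero forcing set, but no connected set of size two is forcing, so the connected variant needs the center and two leaves.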

Nevertheless, in practice, the brute force algorithm is usually outperformed by the other algorithms discussed in the sequel. Section 3.1 describes the Wavefront algorithm – a dynamic programming style improvement of the brute force algorithm, which stores minimum forcing sets of certain subgraphs of $G$ and uses them to build minimum forcing sets of larger subgraphs. Thus, it avoids checking all possible subsets of vertices at the expense of increased memory. Section 3.2 gives a branch-and-bound style improvement of the brute force algorithm for connected forcing; instead of generating all subsets of vertices and checking whether they are connected and forcing, this algorithm generates only connected subgraphs, checks whether they are forcing, and prunes the search tree based on the best zero forcing set found.

3.1 Wavefront Algorithm

In this section, we give a description of the combinatorial algorithm for zero forcing known as Wavefront, developed by Butler et al. [23]. To our knowledge, this algorithm is the only previously-implemented computational method for the zero forcing problem (aside from brute force), and a proof of its correctness does not appear elsewhere in print. We prove that the Wavefront algorithm is correct in Theorem 4 and give a result about its worst-case memory requirements in Theorem 5.

Lemma 2

Let $Z$ be a minimum zero forcing set of a graph $G=(V,E)$ . Then, for any $S\subset Z$ , there does not exist a set $R\subset V$ with $|R|<|S|$ and $cl(S)\subset cl(R)$ .

Proof 2

Suppose for contradiction that there exists a set $R\subset V$ with $|R|<|S|$ and $cl(S)\subset cl(R)$ . Then, since $S\subset cl(S)\subset cl(R)$ , it follows that the vertices in $R$ can force all the vertices in $S$ after some number of timesteps. Since $Z$ is a zero forcing set and $R$ is capable of forcing all the vertices in $S$ , it follows that $(Z\backslash S)\cup R$ is a zero forcing set with cardinality smaller than $Z$ ; this is a contradiction. $\Box$

Given a graph $G=(V,E)$ , a closure pair of $G$ is an ordered pair $(S,r)$ where $S$ is the closure of some subset of $V$ , and $r$ is the cardinality of a subset of $V$ whose closure is $S$ . We will show that each element of the set $\mathcal{C}$ in the Wavefront algorithm is a closure pair. Note that a set whose closure is $S$ and whose cardinality is $r$ is not explicitly identified or stored in the algorithm; indeed there could be many sets with the same closure and the same cardinality.

Lemma 3

Let $G=(V,E)$ be a graph. At each step of the Wavefront algorithm applied to $G$ , each element of the set $\mathcal{C}$ is a closure pair of $G$ .

Proof 3

We will prove the claim by induction on $R$ . In line 1, $\mathcal{C}$ is initialized as a set containing a closure pair. Suppose that for some $R\geq 0$ , all elements of $\mathcal{C}$ are closure pairs, and consider the next iteration of the loop on line 2 which increments $R$ . $\mathcal{C}$ is updated only in line 8, when an ordered pair $(S^{\prime},r^{\prime})$ is added to $\mathcal{C}$ . Moreover, in order for $(S^{\prime},r^{\prime})$ to be added to $\mathcal{C}$ , the if-statement in line 7 has to be satisfied, i.e., $r^{\prime}\leq R$ . Thus, since the value of $R$ increases in the loop on line 2, the second element of each closure pair in $\mathcal{C}$ is no more than the current value of $R$ . In lines 3 and 4, the algorithm loops over the elements of $\mathcal{C}$ and $V$ . Thus, in order to show that the elements of $\mathcal{C}$ are always closure pairs, it is sufficient to show that for a fixed $R\geq 1$ , and for an arbitrary iteration of the loops over $\mathcal{C}$ and $V$ , the resulting $(S^{\prime},r^{\prime})$ which is added to $\mathcal{C}$ is a closure pair. Fix an arbitrary closure pair $(S,r)\in\mathcal{C}$ and an arbitrary vertex $v\in V$ ; this corresponds to fixing an arbitrary iteration of the loops. $S^{\prime}$ is updated only in line 5, and by definition it is the closure of the set $S\cup N[v]$ . Likewise, $r^{\prime}$ is updated only in line 6, and by definition $r^{\prime}=r+|\{v\}\backslash S|+\max\{|N(v)\backslash S|-1,0\}$ .

Suppose first that $N[v]\subset S$ . Then, $S^{\prime}=cl(S\cup N[v])=cl(S)=S$ . Since $(S,r)\in\mathcal{C}$ and $r\leq R$ , the if-statement in line 7 would be false, so $(S^{\prime},r^{\prime})$ would not be added to $\mathcal{C}$ . Now suppose that $N[v]$ is not fully contained in $S$ and that $v$ has a neighbor $x$ outside $S$ . Since $(S,r)$ is a closure pair, there exists a set $A\subset V$ such that $cl(A)=S$ and $|A|=r\leq R$ . Let $A^{\prime}=A\cup((N[v]\backslash S)\backslash\{x\})$ . Then, $cl(A^{\prime})=cl(S\cup N[v])$ , since $A$ can force $S$ , after which all-but-one neighbors of $v$ will be colored, after which $x$ can be forced by $v$ , after which all remaining vertices in $cl(S\cup N[v])$ can be forced by $S\cup N[v]$ . Moreover, $|A^{\prime}|=|A|+|N[v]\backslash S|-1=r+|\{v\}\backslash S|+|N(v)\backslash S|-1=r^{\prime}$ . Thus, $(S^{\prime},r^{\prime})$ is a closure pair, regardless of whether or not it gets added to $\mathcal{C}$ . Finally, suppose that $N[v]$ is not fully contained in $S$ and that $v$ does not have a neighbor outside $S$ , i.e., that $v\notin S$ but $N(v)\subset S$ . Let $A^{\prime}=A\cup\{v\}$ . Then, $cl(A^{\prime})=cl(A\cup\{v\})=cl(S\cup N[v])$ , since $A$ can force $S$ , after which all vertices in $N[v]$ will be colored, after which all remaining vertices in $cl(S\cup N[v])$ can be forced by $S\cup N[v]$ . Moreover, $|A^{\prime}|=|A|+1=r+|\{v\}\backslash S|=r^{\prime}$ . Hence, $(S^{\prime},r^{\prime})$ is a closure pair, regardless of whether or not it gets added to $\mathcal{C}$ . Thus, in every step of the algorithm, $\mathcal{C}$ is a set of closure pairs. $\Box$
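The Wavefront listing itself is not reproduced in this text. The sketch below is a reconstruction from the line-by-line description in the proof above (the loop over $R$, the loops over $\mathcal{C}$ and $V$, the updates of $S^{\prime}$ and $r^{\prime}$, and the test $r^{\prime}\leq R$). Several details are assumptions: the initial closure pair is taken to be the empty set with cost 0, the algorithm is assumed to return as soon as a pair with closure $V$ is stored, each pass over $\mathcal{C}$ uses a snapshot, and the names `wavefront` and `pairs` are illustrative.

```python
def closure(adj, Z):
    """Simple fixed-point closure (not the linear-time routine)."""
    colored = set(Z)
    changed = True
    while changed:
        changed = False
        for u in list(colored):
            uncolored = adj[u] - colored
            if len(uncolored) == 1:
                colored |= uncolored
                changed = True
    return colored


def wavefront(adj):
    """Sketch of Wavefront for a graph given as {vertex: set_of_neighbors}."""
    V = frozenset(adj)
    if not V:
        return 0
    pairs = {frozenset(): 0}  # closure -> r; assumed initial closure pair
    for R in range(1, len(V) + 1):
        for S, r in list(pairs.items()):  # snapshot of the stored closure pairs
            for v in adj:
                # S' = cl(S ∪ N[v])
                S_new = frozenset(closure(adj, S | adj[v] | {v}))
                # r' = r + |{v} \ S| + max(|N(v) \ S| - 1, 0)
                r_new = r + (0 if v in S else 1) + max(len(adj[v] - S) - 1, 0)
                if r_new <= R and S_new not in pairs:
                    pairs[S_new] = r_new
                    if S_new == V:
                        return r_new
    return len(V)


star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
z_star = wavefront(star)
```

For the star, the pass with $R=1$ stores the pairs obtained from each leaf's closed neighborhood, and the pass with $R=2$ extends one of them by the center's neighborhood at cost $1+\max\{2-1,0\}=2$, which colors the whole graph.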

Theorem 4

Given a graph $G=(V,E)$ , the Wavefront algorithm returns $Z(G)$ .

Proof 4

By Lemma 3, at each step of the algorithm, $\mathcal{C}$ is a set of closure pairs of $G$ . Since $G$ has a finite number of closure pairs, each loop of the algorithm is over a finite set, so the algorithm terminates. Note that at any step of the algorithm, $r^{\prime}$ on line 6 is at most $n$ . Thus, for $R$ large enough to satisfy the if-statement on line 7, in line 5 a neighborhood can be added to the largest closure of a closure pair in $\mathcal{C}$ , creating a larger closure; eventually, all vertices can be added, so the largest closure of a closure pair in $\mathcal{C}$ will be $V$ . Thus, the algorithm always returns a number. Let $r^{*}$ be the number returned by the algorithm, and let $Z^{*}$ be a minimum zero forcing set of $G$ .

If $|Z^{*}|=n$ we are done, because the number $r^{*}$ returned by the algorithm always satisfies $Z(G)\leq r^{*}\leq R\leq n$ . Thus, suppose henceforth that $Z^{*}\neq V$ , so $Z^{*}$ contains some vertex $v$ together with all-but-one of its neighbors. Let $S_{0}=N[v]\cap Z^{*}$ . By Lemma 2, $S_{0}$ is a minimum cardinality set of vertices whose closure contains $cl(S_{0})$ . Since $S_{0}$ consists of a single vertex and all-but-one of its neighbors, after at most $|S_{0}|$ iterations of the loop on line 2, the if-statement on line 7 will be true and $(cl(S_{0}),|S_{0}|)$ will be added to $\mathcal{C}$ . If $cl(S_{0})=V$ then the algorithm would terminate and return $|S_{0}|=|Z^{*}|$ , so assume that $cl(S_{0})\neq V$ .

Let $S$ be a maximum cardinality subset of $Z^{*}$ such that $(cl(S),|S|)$ is added to $\mathcal{C}$ at some step of the algorithm. Note that $S\neq\emptyset$ since $|S|\geq|S_{0}|>0$ . If $cl(S)=V$ then the algorithm would terminate at the step when $(cl(S),|S|)$ is added to $\mathcal{C}$ and return $|S|=|Z^{*}|$ , so assume that $cl(S)\neq V$ .

If $V\backslash cl(S)\subset Z^{*}$ , then $cl(S)\cap Z^{*}=S$ , since otherwise $(Z^{*}\backslash cl(S))\cup S$ would be a smaller zero forcing set than $Z^{*}$ ; thus, $|Z^{*}|=|S|+|V\backslash cl(S)|$ . Any closure pair that is built by starting from the closure pair $(cl(S),|S|)$ and repeatedly adding neighborhoods of vertices in line 5 (and storing the intermediate closure pairs in $\mathcal{C}$ in line 8) can have its second entry be at most $|S|+|V\backslash cl(S)|=|Z^{*}|$ ; this is because no vertices from $cl(S)$ can be added at any stage, since then a closure pair with a smaller second entry would have the same closure and the if-statement on line 7 would be false, not allowing the closure pair to be added to $\mathcal{C}$ . Thus, a closure pair $(V,|S|+|V\backslash cl(S)|)=(V,|Z^{*}|)$ will be added to $\mathcal{C}$ . Moreover, no closure pair $(V,q)$ with $q<|Z^{*}|$ will be added to $\mathcal{C}$ since that would imply there is a zero forcing set smaller than $Z^{*}$ . Finally, no closure pair $(V,q)$ with $q>|Z^{*}|$ will be added to $\mathcal{C}$ , because of line 7 and because the loop on line 2 increments the values of $R$ . Thus, in this case the number returned by Wavefront is $r^{*}=|Z^{*}|$ .

Now suppose that $V\backslash cl(S)\not\subset Z^{*}$ ; let $w$ be the first vertex outside $Z^{*}\cup cl(S)$ to get forced by $Z^{*}$ , and let $u$ be the vertex which forces $w$ (given some fixed chronological list of forces). Then, since $Z^{*}$ is a minimum zero forcing set and $S\subset Z^{*}$ , $Z^{*}$ cannot contain a vertex of $cl(S)$ which is not in $S$ . Since $w$ is the first vertex forced outside of $Z^{*}\cup cl(S)$ , it follows that $N[u]\backslash\{w\}\subset Z^{*}$ . Thus,

$$|Z^{*}\cap(S\cup N[u])|=|S|+|N[u]\backslash(cl(S)\cup\{w\})|=|S|+|N[u]\backslash cl(S)|-1;\qquad(1)$$

see Figure 2 for an illustration. Since $w\notin cl(S)$ , there must exist at least one vertex in $N[u]\backslash\{w\}$ that is not in $cl(S)$ and therefore that vertex must be in $Z^{*}$ . Thus,

$$|Z^{*}\cap(S\cup N[u])|>|S|.\qquad(2)$$

Note that $cl(N[u]\cup cl(S))=cl(Z^{*}\cap(S\cup N[u]))$ and by (1), $|S|+|N[u]\backslash cl(S)|-1=|Z^{*}\cap(S\cup N[u])|$ . Thus, $(cl(N[u]\cup cl(S)),|S|+|N[u]\backslash cl(S)|-1)$ is a closure pair. Suppose this closure pair was added to $\mathcal{C}$ in some step of the algorithm. Then, we would have a set $\hat{S}=Z^{*}\cap(S\cup N[u])\subset Z^{*}$ that is larger than $S$ , by (2), and yet $(cl(\hat{S}),|\hat{S}|)$ is added to $\mathcal{C}$ . This contradicts the assumption that $S$ was the largest such set.

Thus, suppose $(cl(N[u]\cup cl(S)),|S|+|N[u]\backslash cl(S)|-1)$ is never added to $\mathcal{C}$ at any step of the algorithm. The algorithm cannot terminate when $R<|S|+|N[u]\backslash cl(S)|-1$, since that would mean that $r^{*}\leq R<|S|+|N[u]\backslash cl(S)|-1\leq|Z^{*}|$ (where the first inequality follows from line 7, and the last inequality follows from (1)), and since $(V,r^{*})$ is a closure pair, there would have to exist a zero forcing set of cardinality less than $|Z^{*}|$, a contradiction. At the iteration $R=|S|+|N[u]\backslash cl(S)|-1$ of the loop on line 2, since $(cl(S),|S|)$ is in $\mathcal{C}$, $S^{\prime}=cl(cl(S)\cup N[u])$ will be created in line 5, and

$$r^{\prime}=|S|+|\{u\}\backslash cl(S)|+\max\{|N(u)\backslash cl(S)|-1,0\}=|S|+|N[u]\backslash cl(S)|-1$$

will be created in line 6.

If a closure pair $(cl(Q),|Q|)$ such that $cl(Q)=S^{\prime}$ and $|Q|\leq R$ is already in $\mathcal{C}$ before step $R$ , then $|Q|<|S|+|N[u]\backslash cl(S)|-1$ . Note that $(Z^{*}\backslash(cl(S\cup N[u])))\cup Q$ is a zero forcing set since $cl(Q)=cl(cl(S)\cup N[u])=cl(S\cup N[u])$ , so all the vertices that are removed from $Z^{*}$ can be forced by $Q$ . Moreover, by (1), $Z^{*}$ contains $|S|+|N[u]\backslash cl(S)|-1$ vertices of $S\cup N[u]$ , and since $S\cup N[u]\subset cl(S\cup N[u])$ , $Z^{*}$ contains at least $|S|+|N[u]\backslash cl(S)|-1$ vertices of $cl(S\cup N[u])$ . Thus,

$$|(Z^{*}\backslash cl(S\cup N[u]))\cup Q|\leq|Z^{*}|-(|S|+|N[u]\backslash cl(S)|-1)+|Q|<|Z^{*}|,$$

a contradiction. Thus, the closure pair $(cl(Q),|Q|)$ does not exist, so the if-statement on line 7 is true and $(cl(N[u]\cup cl(S)),|S|+|N[u]\backslash cl(S)|-1)$ must be added to $\mathcal{C}$; this is a contradiction. $\Box$

Theorem 5

At any step $s$ of the Wavefront algorithm,

$$|\mathcal{C}|\leq\sum_{i=1}^{s}\binom{n}{i},\qquad(3)$$

and this bound is tight.

Proof 5

Multiple sets that have the same closure are not added to $\mathcal{C}$ . Since all permutations of a set have the same closure, at most ${n\choose s}$ new sets can be added to $\mathcal{C}$ in step $s$ . Thus, after step $s$ , $|\mathcal{C}|$ is bounded as in (3). The worst case performance of Wavefront is realized in empty graphs (after $n$ steps), since the closure of any set of vertices is the set itself, and since every combination of vertices must be checked before a zero forcing set is found. $\Box$
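As a small worked instance of the tightness claim, assume the bound of Theorem 5 reads $|\mathcal{C}|\leq\sum_{i=1}^{s}\binom{n}{i}$, as in the equation listing above. In the empty graph on $n=4$ vertices every set is its own closure, so the non-empty closures that can arise are exactly the non-empty subsets of $V$, and after $s=n=4$ steps the bound evaluates to their number:

```latex
\sum_{i=1}^{4}\binom{4}{i} = 4 + 6 + 4 + 1 = 15 = 2^{4}-1 .
```

In general $\sum_{i=1}^{n}\binom{n}{i}=2^{n}-1$, which is why the worst case is no better than enumerating all non-empty subsets of vertices.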

As shown in Theorem 5, in the worst case, the Wavefront algorithm is no better than enumerating all possible subsets of vertices; graphs in which very few vertices can be forced by sets with fewer than $Z(G)$ elements (e.g. stars) also lead to poor performance. However, Wavefront performs much better than the brute force algorithm when the closures of subsets of vertices are larger than the original subsets. This improvement comes from the fact that when some vertices being forced have no uncolored neighbors, they are no longer possible choices to add to the sets in $\mathcal{C}$ .

While Wavefront could potentially be modified to create connected forcing sets, such a modification would eliminate the computational advantages of the algorithm. Wavefront only stores optimal forcing sets for certain subgraphs of $G$ , and then builds the optimal forcing sets of larger subgraphs by adding neighborhoods of vertices containing an uncolored vertex. However, an optimal forcing set for a subgraph may not be connected to other vertices that must be added in order to force the entire graph. Thus, to be useful for finding connected forcing sets, Wavefront would have to store more than just the optimal forcing set for each subgraph, and its performance would suffer as a result. For these reasons, Wavefront will not be a viable method for solving the connected zero forcing problem without significant alterations.

3.2 Branch-and-Bound Algorithm

We now present a combinatorial branch-and-bound algorithm for connected zero forcing; this algorithm is based on a variant of reverse search. It was shown by Avis and Fukuda [7] that the reverse search technique generates all connected induced subgraphs of a graph; we include a proof below for completeness.

Theorem 6

Given a graph $G=(V,E)$ , Algorithm 3 returns $Z_{c}(G)$ .

Proof 6

In Algorithm 3 and throughout this proof, $S$ denotes the vertex set of a partially constructed subgraph, $R$ denotes the set of vertices not yet considered, and $C$ denotes the set of candidate vertices that could be added to $S$ . Algorithm 3 is based on the following Subgraph function, which enumerates all subgraphs of $G$ . In particular, this function recursively adds or does not add to $S$ a vertex $v\in R$ which has not yet been considered. When initialized with $R=V$ and $S=\emptyset$ , this process defines a binary tree $T$ , where the choice of whether or not $v$ gets added to $S$ gives two branches of a subtree of $T$ descending from the node representing $S$ . Then, the leaves of $T$ are the subgraphs of $G$ , and are visited depth-first by the recursion.

To generate the connected induced subgraphs of $G$ , the Subgraph function can be modified into the ConnectedSubgraph function as follows: instead of choosing any vertex $v\in R$ to branch on, the vertex $v$ is chosen to be in $C=R\cap N(S)$ (and in the first level of the search tree, when $S=\emptyset$ , the choice of $v$ is unconstrained). This assures that at each step, $S$ is connected, and the leaves of the search tree $T$ are the connected induced subgraphs of $G$ . If the set of candidate vertices $C$ is empty, there are no other connected induced subgraphs in the subtree descending from $S$ , so that subtree is pruned.

Finally, to visit the minimum connected forcing sets of $G$ , in the ConnectedForcingSet function of Algorithm 3, each subtree of $T$ only includes subsets of vertices that are larger than the subset represented by the root of the subtree. Thus, once a connected zero forcing set of size $\ell$ is found, all subsequent subtrees that lead to subsets of size at least $\ell$ can be pruned. Hence, the correctness of Algorithm 3 follows from the fact that all connected induced subgraphs whose order is less than or equal to the cardinality of an already-discovered connected zero forcing set are enumerated. $\Box$

Since a graph $G$ may have exponentially-many connected induced subgraphs with fewer than $Z_{c}(G)$ vertices, in the worst case this algorithm is no better than brute force. However, as with Wavefront, Algorithm 3 performs much better in practice, when $G$ has relatively few connected induced subgraphs or when large parts of the search tree are pruned.
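The search described in the proof of Theorem 6 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a transcription of Algorithm 3: the names `connected_forcing_number` and `search` are hypothetical, the input graph is assumed to be connected, and the order in which candidates are tried is arbitrary.

```python
def closure(adj, start):
    """Zero forcing closure of a vertex set (adj: vertex -> set of neighbors)."""
    colored = set(start)
    changed = True
    while changed:
        changed = False
        for u in list(colored):
            uncolored = adj[u] - colored
            if len(uncolored) == 1:
                colored |= uncolored
                changed = True
    return colored


def connected_forcing_number(adj):
    """Size of a minimum connected zero forcing set of a connected graph."""
    best = len(adj)  # V itself is a connected forcing set when G is connected

    def search(S, R):
        nonlocal best
        if S and len(closure(adj, S)) == len(adj):
            best = min(best, len(S))  # S is a connected forcing set
            return                    # supersets of S cannot be smaller
        if len(S) + 1 >= best:
            return                    # every extension has size >= best: prune
        # Candidates: unconsidered vertices adjacent to S (any vertex if S is empty).
        candidates = [v for v in R if not S or adj[v] & S]
        allowed = list(R)
        for v in candidates:
            allowed.remove(v)         # v is excluded in all later branches
            search(S | {v}, list(allowed))

    search(set(), list(adj))
    return best


star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
cycle = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
```

For the star with three leaves, two leaves form a zero forcing set but not a connected one, so the sketch returns 3 rather than 2. The pruning test uses the incumbent size in the same way as the bound $\ell$ in the proof.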

4 Integer Programming Approaches

In this section, we describe several integer programming formulations and solution strategies for computing the zero forcing and connected forcing numbers of a graph, and for finding the smallest set which forces a graph within a specified number of timesteps. The presented formulations come from two distinct perspectives on zero forcing. The first perspective is a straightforward model of zero forcing as a dynamic graph infection process. This approach incorporates the dynamic nature of the forcing process by using the vertices forced at each timestep to determine the vertices that can be forced in the next timestep. The second perspective uses the theory of zero forcing forts introduced by Fast and Hicks [37], and models zero forcing as a type of set-covering problem which does not depend on timesteps.

4.1 Infection Perspective

In the formulation of zero forcing as a dynamic process, each edge of the given graph $G=(V,E)$ is replaced by two directed edges with opposite directions. A binary variable $s_{v}$ indicates whether vertex $v$ is in the forcing set; an integer variable $x_{v}$ ranging in $\{0,\ldots,T\}$ indicates at which timestep vertex $v$ is forced, where $T$ is the maximum difference between the forcing times of two vertices; finally, a binary variable $y_{e}$ for each directed edge $e=(u,v)$ indicates whether $u$ forces $v$ . The notation $\delta^{-}(v)$ refers to the set of edges pointing towards node $v$ .

Theorem 7

The optimum of Model 1 is equal to $Z(G)$ .

Proof 7

Let $Z$ be a zero forcing set, fix some chronological list of forces, and let $\mathcal{F}$ be the associated set of forcing chains. Since $Z$ is a zero forcing set, each vertex $v$ of $G$ is either in $Z$ (i.e. $s_{v}=1$ ) or is forced by some other vertex of $G$ (i.e. $y_{e}=1$ and $v$ is the head of $e$ ). Thus, constraint (1) must be satisfied. Now, let $x_{v}$ be the timestep in which $v$ is forced. Since a vertex cannot force until all-but-one of its neighbors are forced, it follows that for every edge $e=(v,w)$ for which $y_{e}=1$ , $v$ must be forced before $w$ and thus $x_{v}<x_{w}$ . Likewise, $x_{i}<x_{w}$ for all neighbors $i$ of $v$ . Thus, constraints (1) and (1) are satisfied. If $y_{e}=0$ , then constraints (1) and (1) are satisfied since $T$ is the maximum difference between the forcing times of two vertices. Thus, the constraints are valid for any zero forcing set and associated set of forcing chains.

Conversely, let $(s,x,y)$ be a feasible solution of Model 1 for the given graph $G$ , and let $Z$ be the set of all vertices for which $s_{v}=1$ . By constraint (1), each vertex $v$ is either in $Z$ , or is the head of exactly one edge $e=(u,v)$ with $y_{e}=1$ . Consider an edge $e=(u,v)$ for which $y_{e}=1$ ; by constraints (1) and (1), there must be some integer $x_{v}\in\{0,\ldots,T\}$ such that $x_{i}+1\leq x_{v}$ for $i\in N[u]\backslash\{v\}$ . By interpreting the $x_{i}$ variable as the timestep in which the vertex $i$ is forced for each vertex with $s_{i}=0$ , it follows that there must exist some timestep $x_{v}\in\{0,\ldots,T\}$ such that if $v$ is forced by $u$ in timestep $x_{v}$ , then $u$ and all its neighbors except $v$ have been forced in previous timesteps. Thus, each vertex is the tail of at most one edge $e$ with $y_{e}=1$ , and so the edges for which $y_{e}=1$ define a set $\mathcal{F}$ of directed paths which have one end-vertex in $Z$ . Since every vertex of $G$ is either in $Z$ or is the head of an edge with $y_{e}=1$ and is therefore part of a path in $\mathcal{F}$ , it follows that $Z$ is a zero forcing set of $G$ and $\mathcal{F}$ is a set of forcing chains associated with $Z$ . $\Box$
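The conditions used in this proof can be checked directly on a candidate solution. The sketch below is an illustration only: the function name `is_valid_forcing_schedule` is hypothetical, and the two checks encode our reading of the proof text (each vertex is in $Z$ or is the head of exactly one forcing edge, and $x_{i}+1\leq x_{v}$ for every $i\in N[u]\backslash\{v\}$ whenever $u$ forces $v$ ), not the exact constraints of Model 1.

```python
def is_valid_forcing_schedule(adj, Z, x, forces):
    """Check the conditions from the proof of Theorem 7 on a candidate solution.

    adj:    vertex -> set of neighbors
    Z:      vertices with s_v = 1
    x:      vertex -> timestep at which the vertex is forced
    forces: iterable of directed pairs (u, v) with y_e = 1, meaning u forces v
    """
    Z = set(Z)
    forces = list(forces)
    heads = [v for (_, v) in forces]
    for v in adj:
        # Either v is in Z, or v is the head of exactly one forcing edge.
        if int(v in Z) + heads.count(v) != 1:
            return False
    for u, v in forces:
        if v not in adj[u]:
            return False              # forces must follow edges of G
        for i in (adj[u] | {u}) - {v}:
            if x[i] + 1 > x[v]:       # u and its other neighbors come earlier
                return False
    return True


path = {0: {1}, 1: {0, 2}, 2: {1}}
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
```

On the star, letting the center force two different leaves is rejected, because each forced leaf would have to be colored strictly after the other.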

Note that for any graph, since at least one vertex is forced at each timestep, it follows that $T<n$ ; it is possible to further bound $T$ if the graph is assumed to have certain properties (see, e.g., [21, 28, 37, 43, 57] for results on the propagation time $T$ ).

The main advantage of Model 1 is that the numbers of constraints and variables in this formulation are polynomial in $n$ ; therefore, the integer program can be solved directly, without delayed row or column generation. Another useful feature of this model is that it not only finds the zero forcing number and a minimum zero forcing set of $G$ , but it also gives a set of forcing chains associated with the forcing set. However, the downfall of Model 1 is its reliance on constraints (1) and (1), which are of big- $M$ form; these constraints lead to poor performance for this model (except on very sparse graphs, as shown in Section 5).

Model 1 is easily adapted for a different purpose – namely, for finding a zero forcing set which forces $G$ within a specified number of timesteps, and has minimum cardinality subject to that property. This result can be achieved by fixing $T$ in Model 1 to be the maximum acceptable number of timesteps to force the graph. As mentioned above, the zero forcing propagation time has been previously investigated (cf. [21, 28, 37, 43, 57]) from a combinatorial standpoint, but to our knowledge, this is the first computational tool for this problem for general graphs. Our experiments show that smaller $T$ values cause Model 1 to solve faster and handle larger graphs.
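For very small graphs, the trade-off between the size of the forcing set and the number of timesteps can be explored by brute force. The sketch below is not Model 1. It enumerates vertex subsets by increasing size and assumes that all applicable forces are performed simultaneously in each timestep; the names `forcing_time` and `min_forcing_set_within` are hypothetical.

```python
from itertools import combinations


def forcing_time(adj, Z):
    """Number of synchronous timesteps Z needs to color the graph, or None."""
    colored = set(Z)
    t = 0
    while len(colored) < len(adj):
        newly = set()
        for u in colored:
            uncolored = adj[u] - colored
            if len(uncolored) == 1:
                newly |= uncolored    # all forces of this timestep happen together
        if not newly:
            return None               # Z is not a zero forcing set
        colored |= newly
        t += 1
    return t


def min_forcing_set_within(adj, T):
    """A smallest vertex set that forces the graph in at most T timesteps."""
    for k in range(1, len(adj) + 1):
        for Z in combinations(sorted(adj), k):
            t = forcing_time(adj, Z)
            if t is not None and t <= T:
                return set(Z)
    return None


# Path on five vertices 0-1-2-3-4.
path5 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
```

On the path with five vertices, one end vertex suffices when four timesteps are allowed, whereas a limit of two timesteps requires both end vertices.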

4.2 Fort Covering Perspective

Our next formulation models zero forcing as a set-covering problem which does not depend on timesteps and does not rely on big- $M$ constraints. In Model 2, the binary variable $s_{v}$ again indicates whether vertex $v$ is in the zero forcing set; $\mathcal{B}$ is the set of all forts in the given graph $G=(V,E)$ .

Theorem 8

The optimum of Model 2 is equal to $Z(G)$ .

Proof 8

Let $s$ be a feasible solution of Model 2, and let $Z$ be the set of all vertices $v$ for which $s_{v}=1$ . Suppose for contradiction that $Z$ is not a zero forcing set of $G$ , which means $cl(Z)\neq V$ . If any vertex $u\in cl(Z)$ was adjacent to exactly one vertex $v\in V\backslash cl(Z)$ , then $u$ could force $v$ , contradicting the definition of $cl(Z)$ . Thus, $V\backslash cl(Z)$ is a fort, and it does not contain any vertex of $Z$ ; this means constraint (2) is violated. It follows that $Z$ must be a zero forcing set of $G$ .

Conversely, let $Z$ be a zero forcing set of $G$ . Suppose for contradiction that there exists a fort $F$ which does not contain any element of $Z$ . In order for the first vertex $v$ of $F$ to be forced, at some timestep, $v$ must be the only neighbor of some colored vertex outside $F$ . However, since $F$ is a fort, any vertex outside $F$ which is adjacent to $v$ is also adjacent to another (uncolored) vertex in $F$ . Thus, $v$ cannot be forced, which contradicts $Z$ being a forcing set. It follows that every fort contains an element of $Z$ , so $Z$ is a feasible solution of Model 2. $\Box$
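Both directions of this argument rest on the definition of a fort: a non-empty vertex set $F$ such that no vertex outside $F$ has exactly one neighbor in $F$ . A minimal Python sketch of this test follows, using an adjacency dictionary and the hypothetical names `is_fort` and `closure`.

```python
def closure(adj, start):
    """Zero forcing closure of a vertex set (adj: vertex -> set of neighbors)."""
    colored = set(start)
    changed = True
    while changed:
        changed = False
        for u in list(colored):
            uncolored = adj[u] - colored
            if len(uncolored) == 1:
                colored |= uncolored
                changed = True
    return colored


def is_fort(adj, F):
    """True if F is non-empty and no vertex outside F has exactly one neighbor in F."""
    F = set(F)
    if not F:
        return False
    return all(len(adj[u] & F) != 1 for u in adj if u not in F)


# Star with center 0 and three leaves.
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
Z = {1}                                # not a zero forcing set of the star
uncovered = set(star) - closure(star, Z)
```

Here `uncovered` is the pair of leaves that $Z$ cannot reach, and it is a fort disjoint from $Z$ , as in the first half of the proof.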

In contrast to Model 1, Model 2 has the advantage of not having big- $M$ constraints. However, the main issue with Model 2 is that since a graph could have an exponential number of forts (in the sense that there exist families of graphs with $n$ vertices and $\Omega(2^{n})$ forts, e.g., $K_{n}$ ), the solution methodologies for Model 2 must use constraint generation (see [30] or [52] for an introduction to this approach). In constraint generation, a relaxed master problem (RMP) is obtained by omitting a set of constraints (in this case, the omitted constraints are the fort cover constraints (2)); then, the RMP is solved, a set of violated constraints from the full model is added to the RMP, and this process is repeated until there are no more violated constraints.

The usefulness of constraint generation depends on the development of a practical method for finding violated constraints. A few high-quality constraints can subsume a large number of low-quality constraints, but the former are usually expensive to find. Thus, there is a trade-off between the time spent on finding constraints, and the number of constraints that have to be found. In the remainder of this section, we consider several different methods for finding violated constraints.

A quick and naive way to generate constraints is the following: if $S$ is a solution to a RMP, the vertex complement of the closure of $S$ , i.e. $V\backslash cl(S)$ , is a violated fort cover constraint. We will refer to this method of generating constraints as the closure complement method.
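The constraint generation loop with the closure complement method can be sketched as follows. To keep the example self-contained, the relaxed master problem is solved by brute-force enumeration instead of an integer programming solver; that substitution, and the names `solve_rmp` and `zero_forcing_by_fort_generation`, are assumptions made for illustration.

```python
from itertools import combinations


def closure(adj, start):
    """Zero forcing closure of a vertex set (adj: vertex -> set of neighbors)."""
    colored = set(start)
    changed = True
    while changed:
        changed = False
        for u in list(colored):
            uncolored = adj[u] - colored
            if len(uncolored) == 1:
                colored |= uncolored
                changed = True
    return colored


def solve_rmp(vertices, forts):
    """Smallest vertex set intersecting every fort generated so far (brute force)."""
    for k in range(len(vertices) + 1):
        for S in combinations(vertices, k):
            if all(F & set(S) for F in forts):
                return set(S)


def zero_forcing_by_fort_generation(adj):
    """Return a minimum zero forcing set and the list of generated forts."""
    vertices = sorted(adj)
    forts = []
    while True:
        S = solve_rmp(vertices, forts)
        violated = set(adj) - closure(adj, S)  # closure complement V \ cl(S)
        if not violated:
            return S, forts                    # S forces the whole graph
        forts.append(violated)                 # add the violated fort constraint


star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
Z, forts = zero_forcing_by_fort_generation(star)
```

Each generated fort is disjoint from the solution that produced it, so no fort is generated twice and the loop terminates.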

Another method to find violated forts is to use the auxiliary integer program given in Model 3. For this model, we define the set $S$ to be the set of all vertices for which $s_{v}=1$ in the current optimal solution of a RMP for Model 2. Note that since the value for each $s_{v}$ is taken from the current optimal solution of the RMP, $S$ is constant for Model 3. The $x_{v}$ variables indicate whether vertex $v$ is in the fort.

Theorem 9

Let $S$ be a feasible solution of a RMP of Model 2. Model 3 finds a minimum size violated fort with respect to $S$ .

Proof 9

Let $B$ be a violated fort of $G$ ; let $x_{v}=1$ for $v\in B$ , and $x_{v}=0$ for $v\notin B$ . Since a fort is non-empty by definition, $B$ must contain at least one vertex; therefore, constraint (3) of the model is satisfied. Again by the definition of a fort, any neighbor of a vertex in $B$ must either be in $B$ or have at least one other neighbor in $B$ ; therefore, constraint (3) is satisfied. Since $B$ is a violated fort, no vertex in $B$ can be in $cl(S)$ ; therefore, constraint (3) is satisfied.

Conversely, let $x$ be a solution of Model 3, and let $B$ be the set of vertices of $G$ for which $x_{v}=1$ . By constraint (3), $B$ is not empty. By constraint (3), every neighbor of a vertex in $B$ must either be in $B$ , or have at least two neighbors in $B$ (note that if $v$ is in $B$ , then $v$ is one of the neighbors of $w$ which is in $B$ ). Thus, $B$ is a fort of $G$ . Furthermore, by constraint (3), no vertex of $B$ is in $cl(S)$ ; therefore $B$ is a violated fort with respect to the solution $S$ . $\Box$

Model 3 separates violated constraints for Model 2. There is precedent in literature for using an integer programming separation method (see, e.g., [6, 39]). In our computational experiments, Model 3 solved relatively quickly, and the forts found using this method are smaller and therefore more effective at solving Model 2 than those found by the closure complement method.
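To illustrate why minimum size forts can be smaller than closure complements, the following sketch finds a minimum violated fort by exhaustive enumeration. It stands in for Model 3 on tiny graphs only and is not an integer program; the names `min_violated_fort`, `is_fort`, and `closure` are hypothetical.

```python
from itertools import combinations


def closure(adj, start):
    """Zero forcing closure of a vertex set (adj: vertex -> set of neighbors)."""
    colored = set(start)
    changed = True
    while changed:
        changed = False
        for u in list(colored):
            uncolored = adj[u] - colored
            if len(uncolored) == 1:
                colored |= uncolored
                changed = True
    return colored


def is_fort(adj, F):
    """True if F is non-empty and no vertex outside F has exactly one neighbor in F."""
    F = set(F)
    return bool(F) and all(len(adj[u] & F) != 1 for u in adj if u not in F)


def min_violated_fort(adj, S):
    """Smallest fort avoiding cl(S), or None if S is a zero forcing set."""
    free = sorted(set(adj) - closure(adj, S))
    for k in range(1, len(free) + 1):
        for B in combinations(free, k):
            if is_fort(adj, B):
                return set(B)
    return None


# Star with center 0 and four leaves.
star4 = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
S = {1}
complement = set(star4) - closure(star4, S)  # closure complement fort
smallest = min_violated_fort(star4, S)       # minimum violated fort
```

On this star the closure complement contains three leaves, while the minimum violated fort contains only two, which gives a tighter covering constraint.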

Instead of adding minimum forts using the IP in Model 3, we can instead add minimal forts, in polynomial time, by finding a maximal non-forcing set $M$ that contains a solution $S$ of the RMP; then, $V(G)\backslash M$ is a minimal fort that can be added as a constraint. To find a maximal non-forcing set $M$ that contains $S$ , a greedy approach can be used whereby $M$ is initialized as $cl(S)$ and the vertices in $V(G)\backslash M$ are searched for a vertex that can be added to $M$ without forming a forcing set; if such a vertex is found, it is added to $M$ and the search repeats until no more vertices can be added. We will call this method of generating minimal forts the maximal closure method. In some cases, the minimal forts will also be minimum, and we will get the effectiveness of the minimum forts without the cost of solving an IP. However, as graphs become larger and more complex, it becomes more likely that the minimal forts will not be minimum. These observations are borne out in our computational results, which show that the maximal closure method performs well on small graphs, but is beaten for larger graphs (in terms of number of instances that can be solved to optimality) by the models that find minimum forts.
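The greedy procedure behind the maximal closure method can be sketched as follows. One detail is an assumption of this sketch: after a vertex is added, $M$ is replaced by the closure of the enlarged set, which does not change the outcome since a maximal non-forcing set is closed. The name `maximal_closure_fort` is hypothetical.

```python
def closure(adj, start):
    """Zero forcing closure of a vertex set (adj: vertex -> set of neighbors)."""
    colored = set(start)
    changed = True
    while changed:
        changed = False
        for u in list(colored):
            uncolored = adj[u] - colored
            if len(uncolored) == 1:
                colored |= uncolored
                changed = True
    return colored


def maximal_closure_fort(adj, S):
    """Grow cl(S) to a maximal non-forcing set M and return the fort V \\ M."""
    V = set(adj)
    M = closure(adj, S)
    if M == V:
        return None                    # S already forces the graph
    added = True
    while added:
        added = False
        for v in sorted(V - M):
            bigger = closure(adj, M | {v})
            if bigger != V:            # adding v keeps M non-forcing
                M = bigger
                added = True
                break                  # restart the search from the new M
    return V - M


# Star with center 0 and four leaves.
star4 = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
fort = maximal_closure_fort(star4, {1})
```

Each pass computes at most $n$ closures and $M$ grows at most $n$ times, so the procedure runs in polynomial time, in line with the claim above.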

Since Model 2 is a set-covering problem, we can use the theory of Balas and Ng [9] on the set-covering polytope to explain why the forts generated by Model 3 are more effective than those found by the closure complement method. In particular, we restate a theorem from [9] in terms of forts; this result gives necessary and sufficient conditions for inequalities with a right-hand-side of one to be facet inducing.

Theorem 10

[9]*

Given a fort $B$ , the inequality $\sum_{v\in B}s_{v}\geq 1$ defines a facet of the zero forcing polytope if and only if the following two conditions hold:*

[TABLE]

Note that in condition (12), the choice of $w$ depends on $v$ and must remain the same for all forts $A\subset B\cup\{v\}$ . Condition (11) explains why Model 3 performs better than the closure complement method. The latter makes no effort to minimize the size of the generated forts; thus, they are unlikely to satisfy (11) and be facet inducing. On the other hand, Model 3 finds minimum size violated forts, which satisfy (11). However, condition (12) is not necessarily satisfied by either method.

This motivates further investigation of ways to add facet-inducing inequalities to Model 2, which we address next. Note that if (12) is violated for a fort $F$ , then there exist a vertex $v$ and $p$ forts $A_{1},\ldots,A_{p}\subset F\cup\{v\}$ such that $v\in A_{i}$ for $1\leq i\leq p$ , but $\bigcap_{i\in[p]}(A_{i}\backslash\{v\})=\emptyset$ . Observe also that the fort constraints given by $F$ and $A_{i}$ for $1\leq i\leq p$ can be combined to give the valid cut $\sum_{i\in F\cup\{v\}}s_{i}\geq 2$ . This valid cut is found by Chvátal-Gomory rounding: we first sum the fort constraints corresponding to $F$ and all the $A_{i}$ ; since $\bigcap_{i\in[p]}(A_{i}\backslash\{v\})=\emptyset$ , the coefficient of each vertex variable in the sum is at most $p$ , but the right-hand-side is $p+1$ . Thus, the valid cut can be obtained by dividing through by $p$ and taking the ceiling of each coefficient and of the right-hand side. Moreover, the magnitude of $p$ is bounded as follows.
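The rounding argument can be written out step by step. The derivation below is a sketch in the $s$ variables of Model 2; the coefficient symbol $c_{w}$ is introduced here for illustration, and $v$ is assumed not to belong to $F$ .

```latex
% Sum the fort constraints of F and of A_1, ..., A_p:
\sum_{w \in F} s_w \;+\; \sum_{i=1}^{p} \sum_{w \in A_i} s_w \;\geq\; p + 1 .

% Collect terms; c_w counts the summed constraints that contain w:
\sum_{w \in F \cup \{v\}} c_w \, s_w \;\geq\; p + 1,
\qquad c_v = p,
\qquad c_w = 1 + |\{\, i : w \in A_i \,\}| \;\leq\; p \quad \text{for } w \in F .

% Divide by p, then round up the coefficients and the right-hand side:
\sum_{w \in F \cup \{v\}} \left\lceil \frac{c_w}{p} \right\rceil s_w
\;\geq\; \left\lceil \frac{p+1}{p} \right\rceil
\quad \Longrightarrow \quad
\sum_{w \in F \cup \{v\}} s_w \;\geq\; 2 .
```

The bound $c_{w}\leq p$ for $w\in F$ holds because every such $w$ is missing from at least one $A_{i}$ . Every coefficient therefore lies between $1$ and $p$ and rounds up to $1$ , while $\lceil(p+1)/p\rceil=2$ for every $p\geq 1$ .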

Theorem 11

Suppose there exist $p$ forts $A_{1},\ldots,A_{p}\subset F\cup\{v\}$ such that there is a vertex $v$ with $v\in A_{i}$ for $1\leq i\leq p$ , but $\bigcap_{i\in[p]}(A_{i}\backslash\{v\})=\emptyset$ . Then $p$ can be chosen to be at most $|F|$ .

Proof 10

Suppose there exist $q>|F|$ forts that satisfy the required properties. Then, for each vertex $w\in F$ , choose one fort among $A_{1},\ldots,A_{q}$ that does not contain $w$ ; since $\bigcap_{i\in[q]}(A_{i}\backslash\{v\})=\emptyset$ , such a fort must exist for each $w$ . This collection consists of at most $|F|$ forts, whose intersection is empty except for $v$ . Thus, it is possible to choose $p\leq|F|$ forts which satisfy the required properties. $\Box$

Given Theorem 11, Model 4 can be used to check whether a fort generated by Model 3 is facet inducing. If the generated fort is not facet inducing, then the valid cut generated as described above can be added instead of a fort constraint. In Model 4, the variable $z_{ij}$ indicates whether vertex $j$ is chosen to be in fort $i$ , and the variable $y_{i}$ indicates whether fort $i$ is empty.

Theorem 12

If Model 4 is infeasible, and $F$ is a minimum size fort, then the fort $F$ is facet inducing. If Model 4 has an optimal solution, then the set of forts with $y_{i}=1$ shows that $F$ is not facet inducing by condition (12) of Theorem 10.

Proof 11

Suppose a minimum size fort $F$ is not facet inducing. Since $F$ has minimum size, it must satisfy condition (11), so $F$ can only violate (12). Therefore, there must exist $p$ forts $A_{1},\ldots,A_{p}\subset F\cup\{v\}$ such that there is a vertex $v$ with $v\in A_{i}$ for $1\leq i\leq p$ but $\bigcap_{i\in[p]}(A_{i}\backslash\{v\})=\emptyset$ . By Theorem 11, we can assume that $p\leq|F|$ . Let $y_{i}=1$ for $1\leq i\leq p$ , and let $y_{i}=0$ otherwise; let $z_{iw}=1$ if $w$ is in fort $A_{i}$ for $1\leq i\leq p$ and $z_{iw}=0$ otherwise; let $x_{v}=1$ and $z_{iv}=1$ if $1\leq i\leq p$ . All other variables are set to $0$ . Now, observe that (4) is satisfied because $x_{w}=0$ for all $w\neq v$ , and $x_{v}=1$ . Constraint (4) is satisfied because each fort $A_{i}$ contained $v$ , and (4) is satisfied because $z_{iv}$ is either $0$ or $1$ and $x_{v}=1$ . Constraint (4) is satisfied because $\bigcap_{i\in[p]}(A_{i}\backslash\{v\})=\emptyset$ . Constraint (4) is satisfied because each $A_{i}$ was a fort, and (4) is satisfied because the $z$ variables are chosen to be 1 only for the forts with $y$ variables chosen to be 1. Hence, $(z,y)$ is a feasible solution to Model 4. Thus, if Model 4 is infeasible, then the fort $F$ must be facet inducing.

Conversely, if Model 4 has an optimal solution, then defining $A_{i}=\{w\in V\colon z_{iw}=1\}$ gives a set of forts which shows that $F$ does not satisfy condition (12) of Theorem 10, and hence $F$ is not facet inducing. $\Box$

Instead of guaranteeing that a fort constraint is facet inducing using Model 4, we can also determine this characteristic in a heuristic manner. In particular, Model 4 can be simplified by limiting the number of forts that can be chosen, i.e., requiring at most 2 forts instead of $|F|$ forts. Our preliminary testing showed that this simplification generally has better performance than the full model.

4.3 Fort Covering Extended

Another way to improve the formulation of Model 2 comes from the observation that any zero forcing set must contain some vertex together with all-but-one of its neighbors. This idea is expressed in Model 5, with the addition of binary variables $z_{v}$ , which indicate that vertex $v$ and all-but-one of its neighbors belong to a zero forcing set. Hence, the $z_{v}$ variables have a cost of $|N(v)|$ in the objective function, and at least one of them is required to be positive (enforced by constraint (5)). Note that if a vertex $w$ is in the closure of $N[v]$ and $z_{v}$ is positive, then $w$ will be forced in the corresponding solution. Therefore, $s_{w}$ and $z_{v}$ will never both be positive; this is enforced by constraint (5). Finally, the fort constraints (constraint (2) of Model 2) are modified to (5) to allow satisfaction by $z_{v}$ variables. Despite the increased number of variables, Model 5 performs better in our experiments than Model 2.

Given the additional variables in Model 5, the constraint generation models must also be expanded to generate violated forts. Instead of minimizing the number of vertices in the fort as in Model 3, our experiments showed better performance when we minimize the number of vertices in the fort that are adjacent to vertices outside of the fort. Such minimum border forts can be found using the integer program in Model 6. In this model, $b_{v}$ is a binary variable that indicates whether the vertex $v$ is adjacent to vertices outside of the fort. $S$ is the set containing every vertex $v$ such that either $s_{v}=1$ in the current solution, or $v$ is in $N[w]$ for some $w$ with $z_{w}=1$ in the current solution. Constraint (6) ensures that the $b_{v}$ variables correctly indicate whether $v$ is on the border of the fort.

4.4 Fort Covering for Connected Zero Forcing

In this section, we adapt the models introduced previously to the connected forcing problem, by adding constraints to enforce connectivity on the chosen zero forcing set. We focus on adding connectivity constraints to Model 2, because it is the best performing model from the previous sections that allows us to ensure connectivity. The neighborhood variables in Model 5 make it difficult to enforce connectivity since one vertex in the neighborhood is not in the zero forcing set.

Drawing from the literature on connected dominating sets and connected power dominating sets, there are multiple ways of enforcing connectivity in integer programs. Fan and Watson [36] compared Miller-Tucker-Zemlin (MTZ) constraints, Martin constraints, single-commodity flow constraints, and multi-commodity flow constraints, and found that the MTZ constraints provide the best computational performance for both the connected dominating set and connected power dominating set problems. Another method of enforcing connectivity is to add $a$ , $b$ -separation cutting planes when needed, in order to cut off disconnected solutions. This method has been used for connected dominating sets [19], Steiner trees [38], and forest planning problems [26] (see also [56] for conditions that cause such inequalities to induce facets of the connected subgraph polytope). In view of these results, we explore the effectiveness of MTZ constraints and $a$ , $b$ -separation inequalities for enforcing connectivity in the connected zero forcing problem.

MTZ constraints were originally introduced by Miller, Tucker, and Zemlin [51] in relation to the Traveling Salesman Problem. The basic idea of MTZ constraints is to enforce the existence of a directed spanning tree in the subgraph induced by the chosen vertices. Our implementation follows Fan and Watson’s [36] explanation of the method introduced by Quintão, da Cunha, Mateus, and Lucena [53]. Two new vertices labeled $\alpha$ and $\beta$ are added to the given graph $G=(V,E)$ , along with a set $E_{new}$ of edges containing a directed edge from each of the two new vertices to all the original vertices; $E_{new}$ also contains a directed edge from $\alpha$ to $\beta$ . In the modified graph, the vertices which are not chosen to be in a forcing set will have a positive edge variable coming into them from $\alpha$ , while $\beta$ will have a positive edge variable going to the root of the directed spanning tree of the chosen connected zero forcing set.

Model 7 combines Model 2 with the MTZ constraints. In Model 7, (7) is the original fort cover constraint from Model 2; the rest of the constraints are the MTZ constraints. In particular, (7) ensures that there is an edge chosen from $\beta$ to some vertex that will be the root of the directed spanning tree of the zero forcing set; (7) ensures that each vertex has an incoming edge. Constraint (7) ensures that vertices connected to $\alpha$ cannot be used to connect to any other vertices; (7) and (7) ensure that there are no cycles in the chosen edges. Constraint (7) ensures that vertices chosen to be in the forcing set must be in the spanning tree instead of connected to $\alpha$ . A solution of Model 7 is a minimum connected forcing set by Theorem 8 and by the fact that MTZ constraints impose connectivity of the selected set of vertices.

Rather than using additional variables, the second method for enforcing connectivity relies on adding valid inequalities which cut off disconnected solutions. These valid inequalities are known as $a$ , $b$ -separation inequalities; the idea behind them is that if a set $C\subset V$ is a vertex cut separating vertices $a$ and $b$ in a graph $G=(V,E)$ , and both $a$ and $b$ are chosen to be in a connected zero forcing set, then some vertex from $C$ must also belong to this forcing set. Model 8 gives the complete formulation for connected zero forcing using $a$ , $b$ -separation inequalities. Constraint (8) is the original fort cover constraint from Model 2, and (8) expresses the $a$ , $b$ -separation inequalities.

The $a$ , $b$ -separation inequalities can be separated efficiently using the observation that if the chosen zero forcing set $Z$ is not connected, then the set $C=V\backslash Z$ must be a vertex cut separating at least two vertices $a\in Z$ and $b\in Z$ . However, as was pointed out by Buchanan et al. [19], the resulting vertex cuts in such an implementation are likely not minimal. Since the decision variable for each vertex in $C$ appears in these constraints, the constraints are stronger when the size of the vertex cut $C$ is minimized. Therefore, vertices could be deleted from a vertex cut of $G$ until it becomes inclusion-minimal. For the dominating set problem (which was the focus of [19]), a valid cutting plane can be obtained from a vertex cut; however, a zero forcing set does not have to be dominating. Therefore, we also require that the vertex cut must be an $a$ , $b$ -separator for $a,b\in Z$ . Fischetti et al. [38] give a method for finding $a$ , $b$ -separators between two components in a graph; we implemented and used a similar $a,b$ -separator algorithm which gives a minimal $a,b$ -separator (see also the algorithm of Buchanan et al. [19] for minimal vertex cuts).
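The reduction of $V\backslash Z$ to an inclusion-minimal $a$ , $b$ -separator can be sketched with a simple greedy deletion pass. This is not the algorithm of Fischetti et al. [38] or the implementation used in our experiments; the names `reachable` and `minimal_separator` and the choice of $a$ and $b$ as smallest-labeled vertices are assumptions made for illustration.

```python
def reachable(adj, source, blocked):
    """Vertices reachable from source without entering any blocked vertex."""
    seen = {source}
    stack = [source]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w not in seen and w not in blocked:
                seen.add(w)
                stack.append(w)
    return seen


def minimal_separator(adj, Z):
    """Return (a, b, C) with C an inclusion-minimal a,b-separator inside V \\ Z.

    Returns None if Z induces a connected subgraph.
    """
    Z = set(Z)
    a = min(Z)
    component = reachable(adj, a, set(adj) - Z)  # component of a in G[Z]
    rest = Z - component
    if not rest:
        return None
    b = min(rest)
    C = set(adj) - Z                             # initial separator V \ Z
    for c in sorted(C):
        # Drop c if a and b stay separated without it.
        if b not in reachable(adj, a, C - {c}):
            C.discard(c)
    return a, b, C


# Path 0-1-2-4 with a pendant vertex 3 attached to vertex 1.
graph = {0: {1}, 1: {0, 2, 3}, 2: {1, 4}, 3: {1}, 4: {2}}
result = minimal_separator(graph, {0, 4})
```

A single pass is enough for inclusion-minimality: a vertex that is kept was needed to separate $a$ and $b$ in a superset of the final cut, so it is still needed in the final cut.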

Model 8

IP model for Connected Zero Forcing using $a$ , $b$ -separator constraints

[TABLE]

5 Computational Results

This section presents implementation details and computational results for finding minimum zero forcing sets and minimum connected forcing sets using the methods described thus far. In particular, for the zero forcing problem, we compare the

1. Wavefront algorithm (Algorithm 2),
2. Infection model (Model 1),
3. Fort Cover model without checking for facet-inducing forts (Model 2 together with Model 3),
4. Fort Cover model with simplified checking for facet-inducing forts (Model 2 together with Model 3 and the simplified version of Model 4),
5. Extended Fort Cover model (Model 5 together with Model 6),
6. Maximal Closure model without checking for facet-inducing forts (Model 2 with maximal closure method for generation of minimal forts),
7. Maximal Closure model with simplified checking for facet-inducing forts (Model 2 with maximal closure method for generation of minimal forts and the simplified version of Model 4).

For the connected forcing problem, we compare the

1. Brute Force algorithm,
2. Branch-and-Bound algorithm (Algorithm 3),
3. Fort Cover model with MTZ constraints (Model 7 together with Model 3),
4. Fort Cover model with $a$ , $b$ -separator constraints (Model 8 together with Model 3).

These methods are respectively labeled Wavefront, Infection, FC no facet, FC w/facet, Ext. Cover, MC no facet, MC w/facet, Brute Force, B&B, MTZ, and $a$ , $b$ -sep in table headings in the next section.

5.1 Implementation Details

Our computational results were obtained on a Dell PowerEdge R330 with an Intel(R) Xeon(R) CPU E3-1270 v5 @ 3.60GHz, 16 GB of RAM, and Red Hat Linux version 4.8.5-11. Integer programs were solved using Gurobi version 7.5.2 set to use a single thread. The Brute Force, Branch-and-Bound, and Wavefront algorithms were implemented in C++ and compiled with g++ version 4.8.5. Implementations of all programs and models used can be obtained at https://github.com/calebfast/zero_forcing.

We tested the different solution approaches on several standard benchmark DIMACS10 [8] (adjnoun, celegansneural, chesapeake, dolphins, football, jazz, karate, lesmis, and polbooks) and IEEE graphs [46] (14-Bus, 24-Bus, 30-Bus, 39-Bus, 57-Bus, 118-Bus, 300-Bus, and 96-RTS), as well as on three classes of random graphs: cubic graphs, WS $(5,0.3)$ graphs, and WS $(10,0.3)$ graphs. We also tested the basic zero forcing methods on a series of star graphs with up to 101 vertices. We used our own C++ implementation to generate random cubic graphs, and we used the connected Watts-Strogatz graph generator from the NetworkX version 1.8.1 package in Python 2.7.6 [29] to generate WS $(5,0.3)$ and WS $(10,0.3)$ graphs; the graphs generated are connected and simple, with no loops or multiple edges. For each family of graphs, we generated five random instances with $10i$ vertices for $i\in\{1,\ldots,10\}$ ; we omitted WS $(10,0.3)$ graphs with 10 vertices since those are just complete graphs. We ran each algorithm for up to two hours. If an integer program could not solve an instance to optimality within two hours, we recorded the lower bound returned by Gurobi and the incumbent solution at the point of timeout (which is an upper bound on the optimal solution).

When solving Model 2 without generation of facet-inducing forts, a maximal set of disjoint minimum size forts was added to the formulation using Model 3 before solving. Other violated fort constraints were added to the model using a MIPSOL callback, which is invoked by Gurobi whenever it finds a new integral incumbent solution. When a violated fort is found, the MIPSOL callback adds that fort to the formulation as a lazy constraint (to enable lazy constraints, the “PreCrush” and “LazyConstraints” parameters of Gurobi were both set to 1). If no more violated forts are found, then Gurobi terminates with an optimal solution.

Similarly, when solving Model 2 with generation of facet-inducing forts, if a generated fort $F$ is not facet inducing, then the valid cut associated with the forts that show $F$ is not facet inducing is added instead of $F$ . Our computational results in Table LABEL:table_times_zf show that while checking for facet-inducing forts sometimes provides a small benefit in average running time and reduces the number of forts that must be generated, it is usually not effective enough to increase the size of the instances that can be solved within 2 hours. Because checking for facet-inducing forts provided no consistent benefit, subsequent models using fort constraints do not check whether the generated forts are facet inducing.

The Maximal Closure model with and without checking for facet-inducing forts, and Model 5, were solved similarly to Model 2. In Model 5, violated minimum border forts were generated using Model 6, and a maximal set of disjoint forts was added to the formulation before solving. In addition, for each $v\in V$ , the fort $V\backslash cl(N[v])$ was also added to the formulation before solving.

Model 7 was solved exactly like Model 2, with the addition of the MTZ constraints. Violated forts are added to the formulation by the MIPSOL callback as lazy constraints; when no more violated forts are found, Gurobi terminates with an optimal solution.

In Model 8, both fort constraints and $a,b$-separation inequalities are added to the model using a MIPSOL callback. This callback first generates a minimum size violated fort by solving Model 3, and adds it to the formulation as a lazy constraint. When no more violated forts are found, the callback checks whether the current solution is connected. If it is not connected, the $a,b$-separator algorithm finds a minimal separator contained in the separator given by the vertices outside the current solution; the corresponding $a,b$-separation inequality is then added to Model 8 as a lazy constraint. When no violated forts are found and the solution is connected, Gurobi terminates with an optimal solution.
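The separator step can be illustrated as follows. This Python sketch uses one common construction from the connectivity-constraint literature: take the neighborhood of the component of $a$ in the subgraph induced by the current solution, and keep only those vertices adjacent to the part of the graph still reachable from $b$. Whether this matches the $a,b$-separator algorithm used in the experiments is an assumption; the function names are hypothetical.

```python
from collections import deque


def component(adj, start, allowed):
    """Vertices reachable from `start` using only vertices in `allowed`."""
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w in allowed and w not in seen:
                seen.add(w)
                queue.append(w)
    return seen


def minimal_separator(adj, solution, a, b):
    """Minimal a,b-separator contained in the vertices outside `solution`.

    Returns None if a and b already lie in the same component of the
    subgraph induced by `solution`.
    """
    comp_a = component(adj, a, set(solution))
    if b in comp_a:
        return None
    # Neighbors of a's component; all of them lie outside the solution.
    boundary = {w for u in comp_a for w in adj[u]} - comp_a
    # Part of the graph that b can still reach once the boundary is removed.
    reach_b = component(adj, b, set(adj) - boundary)
    # Keep only boundary vertices that touch b's side.
    return {v for v in boundary if any(w in reach_b for w in adj[v])}
```

Every vertex in the returned set has a neighbor in the component of $a$ and a neighbor on the side of $b$, so removing any one of them from the set reconnects $a$ and $b$.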

For all these models, the parameters not mentioned in the discussion above were left to their defaults in Gurobi. Some testing showed that tuning certain parameters (such as the branching direction (BranchDir), aggressiveness of cut generation (Cuts), or the focus of the solver (MIPFocus)) could improve performance on some specific instances, but not in general. The branching strategy was also left to the Gurobi default.

Tables LABEL:table_times_zf and LABEL:table_times_zf2 give the average runtimes of the different zero forcing algorithms for the graphs tested; Table LABEL:Table:AverageConnectedWS_10_0.3_ZFTimes gives the average runtimes of the different connected forcing algorithms; Table LABEL:throttling_table gives the average runtimes of Model 1 modified to run with a limited number of timesteps. The reported times reflect the time taken by Gurobi to optimize the relevant models; they include the time necessary for setting up the Gurobi models, but do not include the time necessary for data input.

5.2 Comparison and Discussion

The computational results in Tables LABEL:table_times_zf and LABEL:table_times_zf2 show that for the zero forcing problem, the Wavefront algorithm performs best for random cubic graphs and Watts-Strogatz graphs, while the integer programming models are generally faster for real-world graphs corresponding to electrical power grids and other networks. At the bottom of Table LABEL:table_times_zf, we show a case when the IP models perform much better than Wavefront. Similar behavior can be observed in graphs where small subsets of vertices do not force many vertices outside of the subsets; this is the case in some of the IEEE graphs for which Wavefront also has inferior performance. The Wavefront algorithm is also not as easily adapted as the integer programming methods to additional constraints, such as ensuring connectivity or limiting the number of timesteps used.

For random graphs, the Wavefront algorithm is fastest, followed by the Extended Fort Cover model, the standard Fort Cover models, and the Maximal Closure models; all these approaches considerably outperform the Infection model. Facet-inducing constraints usually provide a slight speedup in the Fort Cover and Maximal Closure models, but do not allow any instances to be solved which could not be solved without facet-inducing constraints. All five Fort Cover models can handle similarly sized random cubic and Watts-Strogatz graphs, although the Extended Fort Cover model solves the instances faster on average, especially in sparser graphs.

For small graphs, using minimal fort constraints with the Maximal Closure models generally gives better performance than using minimum fort constraints generated by an IP. This can be explained by the fact that in small graphs it is more likely that a minimal fort is also minimum — hence, we get the benefit of a minimum fort without the cost of solving an IP to find it. However, for larger graphs, minimum forts generally perform better than minimal forts. The Maximal Closure models (with and without facets) do not solve any instances that the other methods cannot solve, but they fail to solve some instances that the other methods can solve; due to this, the Maximal Closure models are omitted from Table LABEL:Table:AverageConnectedWS_10_0.3_ZFTimes.

The differences in runtimes for the different types of graphs indicate that the integer programs are sensitive to the density and vertex degrees of the graphs, while the Wavefront algorithm is less sensitive to these changes, and is primarily affected by size of the graphs. This can be seen by comparing runtimes for cubic graphs with runtimes for Watts-Strogatz graphs, or runtimes for DIMACS graphs with runtimes for IEEE graphs. For example, the DIMACS football instance has roughly the same number of vertices and three times as many edges as the IEEE 118-Bus instance; the latter is solved or closely bounded by the IP models, while the former is not solved by any models, and the bounds have a very wide gap. In Table LABEL:table_times_zf2, for the graphs corresponding to electrical power grids and other real-world networks, the Wavefront algorithm was generally outperformed by the integer programming models, which were mutually competitive in performance.

For the connected forcing problem, the computational results in Table LABEL:Table:AverageConnectedWS_10_0.3_ZFTimes show that the Branch-and-Bound algorithm performs best on small, sparse graphs, but is outperformed by the integer programming models as the size and density of the graphs increase. This is because the Branch-and-Bound algorithm relies on enumerating connected induced subgraphs, and larger, denser graphs have significantly more such subgraphs. The Fort Cover model with $a,b$-separation constraints solves the largest instances of any method. This result is in line with other results from the literature, as the $a,b$-separation constraints have outperformed the MTZ constraints for other problems such as connected domination [19]. For all of the DIMACS10 graphs, the Fort Cover Model with $a,b$-separation constraints was significantly faster than the model with MTZ constraints and the Branch-and-Bound algorithm; for most other graphs, the three methods were able to handle graphs of roughly the same size. All three of the nontrivial approaches are faster and able to handle larger graphs than the Brute Force approach.

When comparing Models 7 and 8, we see that the MTZ constraints are able to solve similar sizes of instances as the $a,b$-separation constraints, although the $a,b$-separation constraints give faster runtimes. Note that as the average degree of the vertices increases, the likelihood that a chosen subset of vertices will induce a connected graph increases. Therefore, for graphs with high average degree, the $a,b$-separation inequalities were usually not necessary, and the model was solved as a basic zero forcing problem.

When considering the bounds for the IP models for large unsolved instances, we see that the Maximal Closure methods generally give the best bounds, followed by the Infection model. For connected forcing, the bounds given by the MTZ constraints and the $a,b$-separator constraints are roughly the same. In all cases, the graph density appears to affect the quality of the bounds: the gaps between the upper and lower bounds are smaller for unsolved instances of sparse graphs. In some cases, especially in sparser graphs, the gap between the upper and lower bounds is quite small; for example, in the IEEE 300-Bus graph, the Infection Model had a gap of 2 between the upper and lower bound. Thus, the IP models could sometimes be used to accurately approximate the zero forcing numbers of graphs which are too large for exact computation. However, in large graphs which are denser, such as the DIMACS celegansneural or jazz instances, the gaps between the upper and lower bounds are very large. The combinatorial algorithms we considered (Wavefront, the Brute Force algorithm, and the Branch-and-Bound algorithm) yield only one nontrivial bound at the point of timeout: a lower bound in the case of Wavefront and Brute Force, and an upper bound in the case of Branch-and-Bound. As such, they are not as useful as the integer programming models in providing heuristic solutions for large instances. For example, as can be seen from Table LABEL:Table:AverageConnectedWS_10_0.3_ZFTimes, the upper bound given by the Branch-and-Bound algorithm was usually equal to $n$, and was far from the true solution.

Finally, Table LABEL:throttling_table shows computational results for the problem of zero forcing with a bounded number of timesteps. Model 1 runs faster and is able to handle larger graphs when it is coupled with lower bounds on the number of timesteps $T$. For small values of $T$, the model was solved for all graphs, although it was still somewhat impaired by the graph density. As seen in the $\Delta Z$ columns of Table LABEL:throttling_table, the sizes of the zero forcing sets with bounded number of timesteps approach the sizes of the minimum zero forcing sets as $T$ grows. However, the increase in $T$ also causes increased runtime due to the big-$M$ constraints in the model.
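To make the trade-off between the timestep bound $T$ and the size of the forcing set concrete, the following Python sketch counts timesteps under the usual convention that all possible forces are performed simultaneously in each timestep, and then searches by brute force for a smallest set that forces the graph within $T$ timesteps. It is a small illustration under these assumptions, not the integer program of Model 1; the function names are hypothetical.

```python
from itertools import combinations


def propagation_time(adj, start):
    """Timesteps needed for `start` to force the whole graph, or None if it cannot."""
    colored = set(start)
    steps = 0
    while len(colored) < len(adj):
        forced = set()
        for u in colored:
            white = [w for w in adj[u] if w not in colored]
            if len(white) == 1:  # u can force its only uncolored neighbor
                forced.add(white[0])
        if not forced:
            return None  # stalled: not a zero forcing set
        colored |= forced  # all forces of this timestep happen at once
        steps += 1
    return steps


def min_forcing_set_within(adj, max_steps):
    """Brute-force smallest set forcing the graph in at most `max_steps` timesteps."""
    vertices = sorted(adj)
    for size in range(1, len(vertices) + 1):
        for cand in combinations(vertices, size):
            t = propagation_time(adj, cand)
            if t is not None and t <= max_steps:
                return set(cand)
```

On the path with five vertices, one endpoint suffices when four timesteps are allowed, two vertices are needed for $T=2$, and three for $T=1$, which illustrates how the required set size approaches the zero forcing number as $T$ grows.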

6 Conclusion and Future Work

This paper introduced new methods for computing the zero forcing and connected forcing numbers of graphs. We presented combinatorial algorithms, as well as integer programming formulations based on an infection perspective and a set-covering perspective of zero forcing. We explored several solution strategies for these models, drawing from different areas of integer programming and polyhedral theory, and we compared their performance on random cubic graphs, Watts-Strogatz graphs, and various standard benchmark graphs. Our computational experiments show that the Wavefront algorithm generally outperforms the integer programming models for zero forcing on random cubic graphs and Watts-Strogatz graphs, while the integer programming models are faster for real-world graphs corresponding to electrical power grids and other networks. Moreover, our algorithms for connected forcing were comparable in performance to Wavefront and the zero forcing models (and in some cases they were faster and able to handle slightly larger graphs). We also presented an integer program for finding a set of minimum cardinality which forces a graph within a fixed number of timesteps; this model performed very well for small numbers of timesteps. It would be interesting to extend the Fort Cover models to solve this fixed-timestep problem by adding certain valid inequalities, and compare them against the modified Infection model. It would also be interesting to experiment by varying the implementation of the IP models, e.g., adding a violated cut for every pair of disconnected vertices when using Model 8, or adding violated $a,b$ -separation cuts in each callback rather than when the callback solution satisfies all fort inequalities. Our preliminary tests showed that these variations do not seem to provide a benefit, but they may be beneficial for graph families that were not tested.

Some of the difficulty in solving the zero forcing problem is due to the symmetry of solutions; this symmetry arises from the fact that for each zero forcing set and any associated set of forcing chains, another zero forcing set of equal size can be obtained by choosing the terminals of the forcing chains [11]. Such sets of vertices are nearly indistinguishable in many of the algorithms and formulations, and this symmetry is harder to detect than simple isomorphisms in the graph. Any method for dealing with the symmetry of zero forcing has the potential to drastically improve the performance of the integer programs presented in this paper. Therefore, a direction for future work is to focus on breaking this symmetry. Note that this symmetry is somewhat less prevalent in connected forcing, since the set of terminals of forcing chains associated with a connected forcing set is not always connected; this may be part of the reason why the connected variants of the integer programs sometimes performed better.
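The symmetry described above can be checked on small examples. The following Python sketch records which vertices perform a force during one particular forcing process, takes the vertices that never force as the terminals of the forcing chains, and lets the caller verify that the terminals again form a zero forcing set of the same size. The function name and the order in which forces are applied are assumptions of this sketch.

```python
def force(adj, start):
    """Run the color change rule from `start`.

    Returns (colored, forcers): the final colored set and the set of
    vertices that performed a force. Each vertex forces at most once,
    because after forcing it has no uncolored neighbors left.
    """
    colored = set(start)
    forcers = set()
    changed = True
    while changed:
        changed = False
        for u in sorted(colored):
            white = [w for w in adj[u] if w not in colored]
            if len(white) == 1:
                colored.add(white[0])
                forcers.add(u)
                changed = True
    return colored, forcers


def chain_terminals(adj, start):
    """Terminals of the forcing chains produced by `force` for a zero forcing set."""
    colored, forcers = force(adj, start)
    if len(colored) < len(adj):
        raise ValueError("start is not a zero forcing set")
    return colored - forcers  # last vertex of each chain never forces
```

For the star with center 0 and leaves 1, 2, 3, the set {1, 2} yields the chains 1, 0, 3 and 2, so the terminals are {2, 3}, which is a different zero forcing set of the same size.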

Acknowledgements

We thank the five anonymous referees whose helpful and constructive comments greatly improved the presentation and results of the paper. This work was supported by the National Science Foundation, grant numbers 1450681, CMMI-1300477, and CMMI-1404864.

Bibliography

[1] A. Aazami. Domination in graphs with bounded propagation: algorithms, formulations and hardness results. Journal of Combinatorial Optimization, 19(4): 429–456, 2010.
[2] A. Aazami. Hardness results and approximation algorithms for some problems on graphs. PhD thesis, University of Waterloo, 2008.
[3] E. Ackerman, O. Ben-Zwi, and G. Wolfovitz. Combinatorial model and bounds for target set selection. Theoretical Computer Science, 411(44-46): 4017–4022, 2010.
[4] AIM Special Work Group. Zero forcing sets and the minimum rank of graphs. Linear Algebra and its Applications, 428(7): 1628–1648, 2008.
[5] D. Amos, Y. Caro, R. Davila, and R. Pepper. Upper bounds on the $k$-forcing number of a graph. Discrete Applied Mathematics, 181: 1–10, 2015.
[6] P. Avella, M. Boccia, and I. Vasilyev. Computational experience with general cutting planes for the set covering problem. Operations Research Letters, 37: 16–20, 2009.
[7] D. Avis and K. Fukuda. Reverse search for enumeration. Discrete Applied Mathematics, 65: 21–46, 1996.
[8] D.A. Bader, A. Kappes, H. Meyerhenke, P. Sanders, C. Schulz, and D. Wagner. Benchmarking for graph clustering and partitioning. Encyclopedia of Social Network Analysis and Mining, 73–82, 2014. Available at https://www.cc.gatech.edu/dimacs10/downloads.shtml.