Rearrangement operations on unrooted phylogenetic networks

Remie Janssen; Jonathan Klawitter

arXiv:1906.04468·math.CO·December 24, 2019

TL;DR

This paper explores the properties of spaces of unrooted phylogenetic networks under rearrangement operations like NNI, SPR, and TBR, including connectivity, diameter bounds, and computational complexity of distance measures.

Contribution

It extends known rearrangement operations from trees to networks, analyzing their properties and computational complexity in this broader context.

Findings

01

Proved connectedness of network spaces under these operations

02

Established asymptotic bounds on the diameters of network spaces

03

Showed computing TBR and PR distances is NP-hard

Abstract

Rearrangement operations transform a phylogenetic tree into another one and hence induce a metric on the space of phylogenetic trees. Popular operations for unrooted phylogenetic trees are NNI (nearest neighbour interchange), SPR (subtree prune and regraft), and TBR (tree bisection and reconnection). Recently, these operations have been extended to unrooted phylogenetic networks, which are generalisations of phylogenetic trees that can model reticulated evolutionary relationships. Here, we study global and local properties of spaces of phylogenetic networks under these three operations. In particular, we prove connectedness and asymptotic bounds on the diameters of spaces of different classes of phylogenetic networks, including tree-based and level-k networks. We also examine the behaviour of shortest TBR-sequence between two phylogenetic networks in a class, and whether the…

Tables

Table 1: Connectedness and diameters, if bounded, for the various classes and rearrangement operations. Here $m=n+r$, $P$ is a set of phylogenetic networks, and $T\in u\mathcal{T}_{n}$.

class	NNI	PR	TBR
$u\mathcal{T}_{n}$	$\Theta(n\log n)$ [LTZ96]	$\Theta(n)$ [DGH11]	$\Theta(n)$ [DGH11]
$u\mathcal{N}_{n,r}$	$\Theta(m\log m)$ Theorem 5.2	$\Theta(m)$ [FHM18, JJE⁺18]	$\Theta(m)$ Theorem 5.6
$u\mathcal{N}_{n}$	✓ Corollary 5.3	✓ Corollary 5.4	✓ Corollary 5.4
$u\mathcal{N}_{n}(P)$	✓ Proposition 5.7	✓ Proposition 5.7	✓ Proposition 5.7
$u\mathcal{TB}_{n,r}(T)$	$\mathcal{O}(rm)$ Theorem 5.9	$\Theta(r)$ Theorem 5.9	$\Theta(r)$ Theorem 5.9
$u\mathcal{TB}_{n,r}$	$\mathcal{O}(rm+n\log n)$ Theorem 5.10	$\Theta(m)$ Theorem 5.10	$\Theta(m)$ Theorem 5.10
$u\mathcal{TB}_{n}(T)$	✓ Theorem 5.10	✓ Theorem 5.10	✓ Theorem 5.10
$u\mathcal{TB}_{n}$	✓ Theorem 5.10	✓ Theorem 5.10	✓ Theorem 5.10
$u\mathcal{LV}\text{-}k_{n}$	✓ Theorem 5.12	✓ Theorem 5.11	✓ Theorem 5.11

Equations

$\sigma=(N=N_{0},N_{1},N_{2},\ldots,N_{k}=N^{\prime})$

$\operatorname{d_{\textup{TBR}}}(N,N^{\prime})\leq\min\{\operatorname{d_{\textup{TBR}}}(T,T^{\prime}):T\in D(N),T^{\prime}\in D(N^{\prime})\}+r$

$\operatorname{d_{\textup{TBR}}}(N,N^{\prime})\leq r$

$\operatorname{d_{\textup{TBR}}}(T,T^{\prime})\leq k$

$\operatorname{d_{\textup{TBR}}}(T,T^{\prime})\leq r$

$\operatorname{d_{\textup{TBR}}}(T,N)=\min_{T^{\prime}\in D(N)}\operatorname{d_{\textup{TBR}}}(T,T^{\prime})+r$

$2(r-1)$

$3r+2n$

$2r+2n$

$r(1+\log r)$

$2n+n\log n$

$2(6n+8r+n\log n+r\log r)\in\mathcal{O}((n+r)\log(n+r))$

$n-3-\left\lfloor\frac{n-2-1}{2}\right\rfloor+r$

Full text

Rearrangement operations on unrooted phylogenetic networks

Remie Janssen ([ORCID](https://orcid.org/0000-0002-5192-1470))

Delft Institute of Applied Mathematics, Delft University of Technology, Netherlands

Jonathan Klawitter ([ORCID](https://orcid.org/0000-0001-8917-5269))

School of Computer Science, University of Auckland, New Zealand

Abstract

Rearrangement operations transform a phylogenetic tree into another one and hence induce a metric on the space of phylogenetic trees. Popular operations for unrooted phylogenetic trees are NNI (nearest neighbour interchange), SPR (subtree prune and regraft), and TBR (tree bisection and reconnection). Recently, these operations have been extended to unrooted phylogenetic networks (generalisations of phylogenetic trees that can model reticulated evolutionary relationships), where they are called NNI, PR, and TBR moves. Here, we study global and local properties of spaces of phylogenetic networks under these three operations. In particular, we prove connectedness and asymptotic bounds on the diameters of spaces of different classes of phylogenetic networks, including tree-based and level-$k$ networks. We also examine the behaviour of shortest TBR-sequences between two phylogenetic networks in a class, and whether the TBR-distance changes if intermediate networks from other classes are allowed: for example, the space of phylogenetic trees is an isometric subgraph of the space of phylogenetic networks under TBR. Lastly, we show that computing the TBR-distance and the PR-distance of two phylogenetic networks is NP-hard.

1 Introduction

Phylogenetic trees and networks are leaf-labelled graphs that are used to visualise and study the evolutionary history of taxa like species, genes, or languages. While phylogenetic trees are used to model tree-like evolutionary histories, the more general phylogenetic networks can be used for taxa whose past includes reticulate events like hybridisation or horizontal gene transfer [SS03, HRS10, Ste16]. Such reticulate events arise in all domains of life [TN05, RW07, MMM⁺17, WWK⁺17]. In some cases, it can be useful to distinguish between rooted and unrooted phylogenetic networks. In a rooted phylogenetic network, the edges are directed from a designated root towards the leaves. Hence, it models evolution along the passing of time. An unrooted phylogenetic network, on the other hand, has undirected edges and thus represents evolutionary relatedness of the taxa. In some cases, unrooted phylogenetic networks can be thought of as rooted phylogenetic networks in which the orientation of the edges has been disregarded. Such unrooted phylogenetic networks are called proper [JJE⁺18, FHM18]. Here we focus on unrooted, binary, proper phylogenetic networks, where binary means that all vertices except for the leaves have degree three. The set of phylogenetic networks on the same taxa can be partitioned into tiers that contain all networks of the same size.

A rearrangement operation transforms a phylogenetic tree into another tree by making a small graph theoretical change. An operation that works locally within the tree is the NNI (nearest neighbour interchange) operation, which changes the order of the four edges incident to an edge $e$ . See for example the NNI from $T_{1}$ to $T_{2}$ in Figure 1. Two further popular rearrangement operations are the SPR (subtree prune and regraft) operation, which as the name suggests prunes (cuts) an edge and then regrafts (attaches) the resulting half edge again, and the TBR (tree bisection and reconnection) operation, which first removes an edge and then adds a new one to reconnect the resulting two smaller trees. See, for example, the SPR from $T_{2}$ to $T_{3}$ and the TBR from $T_{3}$ to $T_{4}$ in Figure 1.

The set of phylogenetic trees on a fixed set of taxa together with a rearrangement operation yields a graph where the vertices are the trees and two trees are adjacent if they can be transformed into each other with the operation. We call this a space of phylogenetic trees. This construction also induces a metric on phylogenetic trees as the distance of two trees is then given as the distance in this space, that is, the minimum number of applications of the operation that are necessary to transform one tree into the other [SOW96]. However, computing the distance of two trees under NNI, SPR, and TBR is NP-hard [DHJ⁺97, HDRCB08, AS01]. Nevertheless, both the space of phylogenetic trees and a metric on them are of importance for the many inference methods for phylogenetic trees that rely on local search strategies [Gus14, SJ17].

Recently, these rearrangement operations have been generalised to phylogenetic networks, both for unrooted networks [HLMW16, HMW16, FHMW18] and for rooted networks [BLS17, FHMW18, GvIJ⁺17, Kla19]. For unrooted networks, Huber et al. [HLMW16] first generalised NNI to level-1 networks, which are phylogenetic networks where all cycles are vertex disjoint. This generalisation includes a horizontal move that changes the topology of the network, like an NNI on a tree, and vertical moves that add or remove a triangle to change the size of the network. Among other results, they then showed that the space of level-1 networks and its tiers are connected under NNI [HLMW16, Theorem 2]. Note that connectedness implies that the distance between any two networks in such a space is finite and that NNI thus induces a metric. This NNI operation was then extended by Huber et al. [HMW16] to work for general unrooted phylogenetic networks. Again, connectedness of the space was proven. Later, Francis et al. [FHMW18] gave lower and upper bounds on the diameter (the maximum distance) of the space of unrooted phylogenetic networks of a fixed size under NNI. They also showed that SPR and TBR can straightforwardly be generalised to phylogenetic networks, that the connectedness under NNI implies connectedness under SPR and TBR, and they gave bounds on the diameters. These bounds for SPR were made asymptotically tight by Janssen et al. [JJE⁺18]. Here, we improve these bounds on the diameter under TBR.

There are several generalisations of SPR on rooted phylogenetic trees to rooted phylogenetic networks for which connectedness and diameters have been obtained [BLS17, FHMW18, GvIJ⁺17, JJE⁺18, Jan18]. For example, Bordewich et al. [BLS17] introduced SNPR (subnet prune and regraft), a generalisation of SPR that includes vertical moves, which add or remove an edge. They then proved connectedness under SNPR for the space of rooted phylogenetic networks and for special classes of phylogenetic networks including tree-based networks. Roughly speaking, these are networks that have a spanning tree that is the subdivision of a phylogenetic tree on the same taxa [FS15, FHM18]. Furthermore, Bordewich et al. [BLS17] gave several bounds on the SNPR-distance of two phylogenetic networks. Further bounds and a characterisation of the SNPR-distance of a tree and a network were recently proven by Klawitter and Linz [KL19]. Here, we show that these bounds and characterisation on the SNPR-distance of rooted phylogenetic networks are analogous to the TBR-distance of two unrooted phylogenetic networks.

In this paper, we study spaces of unrooted phylogenetic networks under NNI, PR (prune and regraft), and TBR. Here, the PR and the TBR operations are the generalisations of SPR and TBR on trees, respectively, where vertical moves add or remove an edge like the vertical moves of the SNPR operation in the rooted case. After the preliminary section, we examine the relation of NNI, PR, and TBR; in particular, how a sequence using one of these operations can be transformed into a sequence using another operation (Section 3). We then study properties of shortest paths under TBR in Section 4. This includes the translation of the results from Bordewich et al. [BLS17] and Klawitter and Linz [KL19] on the SNPR-distance of rooted phylogenetic networks to the TBR-distance of unrooted phylogenetic networks. Next, we consider the connectedness and diameters of spaces of phylogenetic networks for different classes of phylogenetic networks, including tree-based networks and level-$k$ networks (Section 5). A subspace of phylogenetic networks (e.g., the space of tree-based networks) is an isometric subgraph of a larger space of phylogenetic networks if, roughly speaking, the distance of two networks is the same in the smaller and the larger space. In Section 6 we study such isometric relations and answer a question by Francis et al. [FHMW18] by showing that the space of phylogenetic trees is an isometric subgraph of the space of phylogenetic networks under TBR. We use this result in Section 7 to show that computing the TBR-distance is NP-hard. In the same section, we also show that computing the PR-distance is NP-hard.

2 Preliminaries

This section provides notation and terminology used in the remainder of the paper. In particular, we define phylogenetic networks and special classes thereof, and rearrangement operations and how they induce distances. Throughout this paper, $X=\{1,2,\ldots,n\}$ denotes a finite set of taxa.

Phylogenetic networks.

An unrooted, binary phylogenetic network $N$ on a set of taxa $X$ is an undirected multigraph such that the leaves are bijectively labelled with $X$ and all non-leaf vertices have degree three. It is called proper if every cut-edge separates two labelled leaves [FHM18], and improper otherwise. This property implies that every edge lies on a path that connects two leaves. More importantly, a network can be rooted at any leaf if and only if it is proper [JJE⁺18, Lemma 4.13]. If not mentioned otherwise, we assume that a phylogenetic network is proper. Furthermore, note that our definition of a phylogenetic network permits the existence of parallel edges in $N$, i.e., we allow that two distinct edges join the same pair of vertices. An unrooted, binary phylogenetic tree $T$ on $X$ is an unrooted, binary phylogenetic network on $X$ that is a tree.

Let $u\mathcal{N}_{n}$ denote the set of all unrooted, binary proper phylogenetic networks on $X$ and let $u\mathcal{T}_{n}$ denote the set of all unrooted, binary phylogenetic trees on $X$, where $X=\{1,2,\ldots,n\}$. To ease reading, we refer to an unrooted, binary proper phylogenetic network (resp. unrooted, binary phylogenetic tree) on $X$ simply as phylogenetic network or network (resp. phylogenetic tree or tree). Figure 2 shows an example of a tree $T\in u\mathcal{T}_{6}$, a network $N\in u\mathcal{N}_{6}$, and an improper network $M$.

An edge of a network $N$ is an external edge if it is incident to a leaf, and an internal edge otherwise. A cherry $\{a,b\}$ of $N$ is a pair of leaves $a$ and $b$ in $N$ that are adjacent to the same vertex. For example, each network in Figure 2 contains the cherry $\{1,5\}$ .

Tiers.

We say a network $N=(V,E)$ has reticulation number $r$ for $r=\lvert E\rvert-(\lvert V\rvert-1)$, that is, the number of edges that have to be deleted from $N$ to obtain a spanning tree of $N$. (In graph theory, the value $\lvert E\rvert-(\lvert V\rvert-1)$ of a connected graph is also called the cyclomatic number of the graph [Die17].) For example, the network $N$ in Figure 2 has reticulation number three. Note that a phylogenetic tree is a phylogenetic network with reticulation number zero. Let $u\mathcal{N}_{n,r}$ denote tier $r$ of $u\mathcal{N}_{n}$, the set of networks in $u\mathcal{N}_{n}$ that have reticulation number $r$.
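The formula for the reticulation number can be checked on small examples. The following is a minimal sketch, assuming a network is stored as a plain list of vertex pairs (so parallel edges are simply repeated entries); the function name `reticulation_number` and the two toy networks are illustrative and not taken from the paper.

```python
def reticulation_number(edges):
    """Return |E| - (|V| - 1) for a connected multigraph given as an edge list."""
    vertices = {v for edge in edges for v in edge}
    return len(edges) - (len(vertices) - 1)


# A tree on the leaves 1, 2, 3 with the single internal vertex "a".
tree = [(1, "a"), (2, "a"), (3, "a")]

# A network on the leaves 1, 2 whose internal vertices "a" and "b"
# are joined by a pair of parallel edges.
network = [(1, "a"), ("a", "b"), ("a", "b"), ("b", 2)]

print(reticulation_number(tree))     # 0
print(reticulation_number(network))  # 1
```

The sketch does not verify that the input is connected or binary; it only evaluates the formula.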

Embedding.

Let $G$ be an undirected graph. Subdividing an edge $\{u,v\}$ of $G$ consists of replacing $\{u,v\}$ by a path from $u$ to $v$ that contains at least one edge. A subdivision $G^{*}$ of $G$ is a graph that can be obtained from $G$ by subdividing edges of $G$. If $G$ has no degree two vertices, there exists a canonical embedding of vertices of $G$ to vertices of $G^{*}$ and of edges of $G$ to paths of $G^{*}$. Let $N\in u\mathcal{N}_{n}$. We say $G$ has an embedding into $N$ if there exists a subdivision $G^{*}$ of $G$ that is a subgraph of $N$ such that the embedding maps each labelled vertex of $G^{*}$ to a labelled vertex of $N$ with the same label.

Displaying.

Let $T\in u\mathcal{T}_{n}$ and $N\in u\mathcal{N}_{n}$ . We say $N$ displays $T$ if $T$ has an embedding into $N$ . For example, in Figure 2 the tree $T$ is displayed by both networks $N$ and $M$ . Let $D(N)$ be the set of trees in $u\mathcal{T}_{n}$ that are displayed by $N$ . This notion can be extended to trees with fewer leaves, and to networks. For this, let $M$ be a phylogenetic network on $Y\subseteq X=\{1,\ldots,n\}$ . We say $N$ displays $M$ if $M$ has an embedding into $N$ . Let $P=\{M_{1},\ldots,M_{k}\}$ be a set of phylogenetic networks $M_{i}$ on $Y_{i}\subseteq X=\{1,\ldots,n\}$ . Then let $u\mathcal{N}_{n}(P)$ denote the subset of networks in $u\mathcal{N}_{n}$ that display each network in $P$ .

Tree-based networks.

A phylogenetic network $N\in u\mathcal{N}_{n}$ is a tree-based network if there is a tree $T\in u\mathcal{T}_{n}$ that has an embedding into $N$ as a spanning tree. In other words, there exists a subdivision $T^{*}$ of $T$ that is a spanning tree of $N$ . The tree $T$ is then called a base tree of $N$ . Let $u\mathcal{TB}_{n}$ denote the set of tree-based networks in $u\mathcal{N}_{n}$ . For $T\in u\mathcal{T}_{n}$ , let $u\mathcal{TB}_{n}(T)$ denote the set of tree-based networks in $u\mathcal{TB}_{n}$ with base tree $T$ .

Level- $k$ networks.

A blob $B$ of a network $N\in u\mathcal{N}_{n}$ is a nontrivial two-connected component of $N$ . The level of $B$ is the minimum number of edges that have to be removed from $B$ to make it acyclic. The level of $N$ is the maximum level of all blobs of $N$ . If the level of $N$ is at most $k$ , then $N$ is called a level- $k$ network. Let $u\mathcal{LV}\text{-}k_{n}$ denote the set of level- $k$ networks in $u\mathcal{N}_{n}$ .
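The level of a small network can be computed directly from this definition. The sketch below is one possible approach rather than a method from the paper: it relies on the assumption that, because every vertex has degree at most three, the blobs are exactly the components with at least one edge that remain after all cut-edges are deleted. The helper names `components` and `level` and both example networks are hypothetical, and the quadratic cut-edge test is chosen for brevity, not efficiency.

```python
def components(vertices, edges):
    """Connected components (as vertex sets) of a multigraph, via union-find."""
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for a, b in edges:
        parent[find(a)] = find(b)
    groups = {}
    for v in vertices:
        groups.setdefault(find(v), set()).add(v)
    return list(groups.values())


def level(edges):
    """Maximum over all blobs B of |E(B)| - |V(B)| + 1 (0 for a tree)."""
    vertices = {v for e in edges for v in e}

    def connected_without(i):
        rest = edges[:i] + edges[i + 1:]
        return len(components(vertices, rest)) == 1

    # Keep only the edges that are not cut-edges.
    non_bridges = [e for i, e in enumerate(edges) if connected_without(i)]
    best = 0
    for comp in components(vertices, non_bridges):
        m = sum(1 for a, _ in non_bridges if a in comp)
        best = max(best, m - len(comp) + 1)
    return best


# A triangle and a pair of parallel edges: reticulation number 2, level 1.
two_blobs = [(1, "a"), (2, "b"), ("a", "b"), ("b", "c"), ("a", "c"),
             ("c", "d"), ("d", "e"), ("d", "e"), ("e", "f"), ("f", 3), ("f", 4)]

# A single blob on the vertices a, b, c, d: reticulation number 2, level 2.
one_blob = [(1, "a"), (2, "b"), ("a", "c"), ("a", "d"),
            ("b", "c"), ("b", "d"), ("c", "d")]

print(level(two_blobs))  # 1
print(level(one_blob))   # 2
```

The two examples have the same reticulation number but different levels, which illustrates that the level measures reticulation per blob rather than for the whole network.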

$r$ -Burl.

An $r$-burl is a specific type of blob that we define recursively: a $1$-burl is the blob consisting of a pair of parallel edges; for $r>1$, an $r$-burl is the blob obtained by placing a pair of parallel edges on one of the parallel edges of an $(r-1)$-burl. See for example the network $M$ in Figure 3.

$r$ -Handcuffed trees and caterpillars.

Let $T\in u\mathcal{T}_{n}$ and let $a$ and $b$ be two leaves of $T$. Let $e$ and $f$ be the edges incident to $a$ and $b$, respectively. Subdivide $e$ and $f$ with vertices $\{u_{1},\ldots,u_{r}\}$ and $\{v_{1},\ldots,v_{r}\}$, respectively, and add the edges $\{u_{1},v_{1}\},\ldots,\{u_{r},v_{r}\}$. The resulting network is an $r$-handcuffed tree $N\in u\mathcal{N}_{n}$ with base tree $T$ on the handcuffed leaves $\{a,b\}$. Note that $N$ has reticulation number $r$. If the tree $T$ is a caterpillar and $a$ and $b$ form a cherry of $T$, then the resulting network $N$ is an $r$-handcuffed caterpillar. Furthermore, we call an $r$-handcuffed caterpillar sorted if it is handcuffed on the leaves 1 and 2 and the leaves from 3 to $n$ have a non-decreasing distance to leaf 1. See Figure 3 for an example.

Suboperations.

To define rearrangement operations on phylogenetic networks, we first define several suboperations. Let $G$ be an undirected graph. A degree-two vertex $v$ of $G$ with adjacent vertices $u$ and $w$ gets suppressed by deleting $v$ and its incident edges, and adding the edge $\{u,w\}$ . The reverse of this suppression is the subdivision of $\{u,w\}$ with vertex $v$ .

Let $N\in u\mathcal{N}_{n}$ be a network, and $\{u,v\}$ an edge of $N$ . Then $\{u,v\}$ gets removed by deleting $\{u,v\}$ from $N$ and suppressing any resulting degree-two vertices. We say $\{u,v\}$ gets pruned at $u$ by transforming it into the half edge $\{\cdot,v\}$ and suppressing $u$ if it becomes a degree-two vertex. Note that otherwise $u$ is a leaf. In reverse, we say that a half edge $\{\cdot,v\}$ gets regrafted to an edge $\{x,y\}$ by transforming it into the edge $\{u,v\}$ where $u$ is a new vertex subdividing $\{x,y\}$ .

TBR.

The TBR operation is known on unrooted phylogenetic trees as tree bisection and reconnection. Since in general networks are not trees and a TBR on a network does not necessarily bisect it, we use TBR now as a word on its own. For the reader who would however like to have an expansion of TBR we suggest "total branch relocation". We welcome other suggestions.

A TBR operation is the rearrangement operation that transforms a network $N\in u\mathcal{N}_{n}$ into another network $N^{\prime}\in u\mathcal{N}_{n}$ in one of the following four ways:

(TBR0)

Remove an internal edge $e$ of $N$ , subdivide an edge of the resulting graph with a new vertex $u$ , subdivide an edge of the resulting graph with a new vertex $v$ , and add the edge $\{u,v\}$ ;

or, prune an external edge $e=\{u,v\}$ of $N$ that is incident to leaf $v$ at $u$, and regraft $\{\cdot,v\}$ to an edge of the resulting graph.

(TBR+)

Subdivide an edge of $N$ with a new vertex $u$ , subdivide an edge of the resulting graph with a new vertex $v$ , and add the edge $e=\{u,v\}$ .

(TBR-)

Remove an edge $e$ of $N$ .

Note that a TBR0 can also be seen as the operation that prunes the edge $e=\{u,v\}$ at both $u$ and $v$ and then regrafts both ends. Hence, we say that a TBR0 moves the edge $e$ . Furthermore, we say that a TBR+ adds the edge $e$ and that a TBR- removes the edge $e$ . These operations are illustrated in Figure 4. Note that a TBR0 has an inverse TBR0 and that a TBR+ has an inverse TBR-, and that furthermore a TBR+ increases the reticulation number by one and a TBR- decreases it by one.

Since a TBR operation has to yield a phylogenetic network, there are some restrictions on the edges that can be moved or removed. Firstly, if removing an edge by a TBR0 yields a disconnected graph, then in order to obtain a phylogenetic network an edge has to be added between the two connected components. Similarly, a TBR- cannot remove a cut-edge. Secondly, the suppression of a vertex when removing an edge may not yield a loop $\{u,u\}$ . Thirdly, removing or moving an edge cannot create a cut-edge that does not separate two leaves. Otherwise the network would not be proper.

The TBR0 operation equals the well known TBR (tree bisection and reconnection) operation on unrooted phylogenetic trees [AS01]. The TBR operation on trees has recently been generalised to TBR0 on improper unrooted phylogenetic networks by Francis et al. [FHMW18].

PR.

A PR (prune and regraft) operation is the rearrangement operation that transforms a network $N\in u\mathcal{N}_{n}$ into another network $N^{\prime}\in u\mathcal{N}_{n}$ with a PR+ $=$ TBR+, a PR- $=$ TBR-, or a PR0 that prunes and regrafts an edge $e$ only at one endpoint, instead of at both like a TBR0. Like for TBR, we then say that the PR0/+/- moves/adds/removes the edge $e$ in $N$. The PR operation is a generalisation of the well-known SPR (subtree prune and regraft) operation on unrooted phylogenetic trees [AS01]. Like for TBR, the generalisation of SPR to PR0 for networks has been introduced by Francis et al. [FHMW18].

NNI.

An NNI (nearest neighbour interchange) operation is a rearrangement operation that transforms a network $N\in u\mathcal{N}_{n}$ into another network $N^{\prime}\in u\mathcal{N}_{n}$ in one of the following three ways:

(NNI0)

Let $e=\{u,v\}$ be an internal edge of $N$ . Prune an edge $f$ ( $f\neq e$ ) at $u$ , and regraft it to an edge $f^{\prime}$ ( $f^{\prime}\neq e$ ) that is incident to $v$ .

(NNI+)

Subdivide two adjacent edges with new vertices $u^{\prime}$ and $v^{\prime}$ , respectively, and add the edge $\{u^{\prime},v^{\prime}\}$ .

(NNI-)

If $N$ contains a triangle, remove an edge of the triangle.

These operations are illustrated in Figure 5. We say that an NNI0 moves the edge $f$ . Alternatively, we call the edge $e$ of an NNI0 the axis of the operation, as the operation can also be defined as pruning $f$ at $u$ , and $f^{\prime\prime}\neq f^{\prime}$ at $v$ , and regrafting $f$ at $v$ and $f^{\prime\prime}$ at $u$ . The NNI operation has been introduced on trees by Robinson [Rob71] and generalised to networks by Huber et al. [HLMW16, HMW16].

Sequences and distances.

Let $N,N^{\prime}\in u\mathcal{N}_{n}$ be two networks. A TBR-sequence from $N$ to $N^{\prime}$ is a sequence

$\sigma=(N=N_{0},N_{1},N_{2},\ldots,N_{k}=N^{\prime})$

of phylogenetic networks such that $N_{i}$ can be obtained from $N_{i-1}$ by a single TBR for each $i\in\{1,2,\ldots,k\}$. The length of $\sigma$ is $k$. The TBR-distance $\operatorname{d_{\textup{TBR}}}(N,N^{\prime})$ between $N$ and $N^{\prime}$ is the length of a shortest TBR-sequence from $N$ to $N^{\prime}$, or infinite if no such sequence exists.

Let $\mathcal{C}_{n}$ be a class of phylogenetic networks. The TBR-distance on $\mathcal{C}_{n}$ is defined like on $u\mathcal{N}_{n}$ but with the restriction that every network in a shortest TBR-sequence has to be in $\mathcal{C}_{n}$. The class $\mathcal{C}_{n}$ is connected under TBR if, for all pairs $N,N^{\prime}\in\mathcal{C}_{n}$, there exists a TBR-sequence $\sigma$ from $N$ to $N^{\prime}$ such that each network in $\sigma$ is in $\mathcal{C}_{n}$. Hence, for the TBR-distance to be a metric on $\mathcal{C}_{n}$, the class has to be connected under TBR and the TBR operation has to be reversible. We already noted above that the latter holds for TBR (and NNI and PR). For a connected class $\mathcal{C}_{n}$, the diameter is the maximum distance between two of its networks under its metric. The definitions for NNI and PR are analogous.

Let $\mathcal{C}_{n}^{\prime}$ be a subclass of $\mathcal{C}_{n}$. Then $\mathcal{C}_{n}^{\prime}$ is an isometric subgraph of $\mathcal{C}_{n}$ under, say, TBR if for every $N,N^{\prime}\in\mathcal{C}_{n}^{\prime}$ the TBR-distance of $N$ and $N^{\prime}$ in $\mathcal{C}_{n}^{\prime}$ equals the TBR-distance of $N$ and $N^{\prime}$ in $\mathcal{C}_{n}$.
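Distance, diameter, and the isometric-subgraph property are notions about the space viewed as a graph, so they can be illustrated with breadth-first search on any small graph. The sketch below uses an abstract toy space (a cycle on five vertices "A" to "E") in place of an actual space of networks; the function names and the toy data are assumptions made for illustration only.

```python
from collections import deque


def distances_from(space, source, allowed=None):
    """BFS distances from `source`, walking only through vertices in `allowed`."""
    allowed = set(space) if allowed is None else set(allowed)
    dist = {source: 0}
    queue = deque([source])
    while queue:
        x = queue.popleft()
        for y in space[x]:
            if y in allowed and y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist


def diameter(space, allowed=None):
    """Maximum distance within the class; infinite if the class is not connected."""
    allowed = set(space) if allowed is None else set(allowed)
    best = 0
    for s in allowed:
        d = distances_from(space, s, allowed)
        if len(d) < len(allowed):
            return float("inf")
        best = max(best, max(d.values()))
    return best


def is_isometric(space, sub):
    """True if distances inside `sub` equal the distances in the whole space."""
    sub = set(sub)
    inf = float("inf")
    for s in sub:
        inner = distances_from(space, s, sub)
        outer = distances_from(space, s)
        if any(inner.get(t, inf) != outer.get(t, inf) for t in sub):
            return False
    return True


# Toy space: the cycle A - B - C - D - E - A.
space = {"A": ["B", "E"], "B": ["A", "C"], "C": ["B", "D"],
         "D": ["C", "E"], "E": ["D", "A"]}

print(diameter(space))                              # 2
print(diameter(space, {"A", "B", "C", "D"}))        # 3
print(is_isometric(space, {"A", "B", "C"}))         # True
print(is_isometric(space, {"A", "B", "C", "D"}))    # False
```

The subclass {A, B, C, D} is connected but not isometric: inside it, A and D are at distance three, whereas the shortcut through E gives distance two in the whole space.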

3 Relations of rearrangement operations

On trees, it is well known that every NNI is also an SPR, which, in turn, is also a TBR. We observe that the same holds for the generalisations of these operations as defined above.

Observation 3.1.

Let $N\in u\mathcal{N}_{n}$ . Then, on $N$ , every NNI is a PR and every PR is a TBR.

For the reverse direction, we first show that every TBR can be mimicked by at most two PRs, as in $u\mathcal{T}_{n}$. Then we show how to substitute a PR with an NNI-sequence.

Lemma 3.2.

Let $N,N^{\prime}\in u\mathcal{N}_{n}$ such that $\operatorname{d_{\textup{TBR}}}(N,N^{\prime})=1$ . Then $1\leq\operatorname{d_{\textup{PR}}}(N,N^{\prime})\leq 2$ , where a TBR0 may be replaced by two PR0.

Proof.

If $N^{\prime}$ can be obtained from $N$ by a TBR+ or TBR-, then by the definition of PR+ and PR- it follows that $\operatorname{d_{\textup{PR}}}(N,N^{\prime})=1$ . If $N^{\prime}$ can be obtained from $N$ by a TBR0 that is also a PR0, the statement follows. Assume therefore that $N^{\prime}$ can be obtained from $N$ by a TBR0 that moves the edge $e=\{u,v\}$ of $N$ to $e^{\prime}=\{x,y\}$ of $N^{\prime}$ . Let $G$ be the graph obtained from $N$ by removing $e$ , or equivalently the graph obtained from $N^{\prime}$ by removing $e^{\prime}$ . If $e$ is a cut-edge, then so is $e^{\prime}$ , and without loss of generality $u$ and $x$ as well as $v$ and $y$ subdivide an edge in the same connected components of $G$ . Furthermore, if $u$ subdivides an edge of a pendant blob in $G$ , then so does $x$ . Otherwise $N^{\prime}$ would not be proper. Therefore, the PR0 that prunes $e$ at $u$ and regrafts it to obtain $x$ yields a phylogenetic network $N^{\prime\prime}$ . The choices of $u$ and $x$ ensure that $N^{\prime\prime}$ is connected and proper. There is then a PR0 from $N^{\prime\prime}$ to $N^{\prime}$ that prunes $\{x,v\}$ at $v$ and regrafts it at $y$ to obtain $N^{\prime}$ . Hence, $\operatorname{d_{\textup{PR}}}(N,N^{\prime})\leq 2$ . ∎

Corollary 3.3.

Let $N,N^{\prime}\in u\mathcal{N}_{n}$ . Then $\operatorname{d_{\textup{TBR}}}(N,N^{\prime})\leq\operatorname{d_{\textup{PR}}}(N,N^{\prime})\leq 2\operatorname{d_{\textup{TBR}}}(N,N^{\prime})$ .
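The corollary is stated without proof. One way to see it, sketched here in our own wording from Observation 3.1 and Lemma 3.2, is to take a shortest TBR-sequence $(N=N_{0},N_{1},\ldots,N_{k}=N^{\prime})$ with $k=\operatorname{d_{\textup{TBR}}}(N,N^{\prime})$ and replace each step separately:

```latex
\begin{align*}
\operatorname{d_{\textup{TBR}}}(N,N^{\prime})
  &\leq \operatorname{d_{\textup{PR}}}(N,N^{\prime})
  && \text{(every PR is a TBR, Observation 3.1)} \\
\operatorname{d_{\textup{PR}}}(N,N^{\prime})
  &\leq \sum_{i=1}^{k} \operatorname{d_{\textup{PR}}}(N_{i-1},N_{i})
   \leq 2k
   = 2\operatorname{d_{\textup{TBR}}}(N,N^{\prime})
  && \text{(Lemma 3.2 applied to each step)}
\end{align*}
```

The first inequality in the second line is the triangle inequality for the PR-distance.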

Lemma 3.4.

Let $N,N^{\prime}\in u\mathcal{N}_{n,r}$ such that there is a PR0 that transforms $N$ into $N^{\prime}$. Let $e$ be the edge of $N$ pruned by this PR0.

Then there exists an NNI0-sequence from $N$ to $N^{\prime}$ that only moves $e$ and whose length is in $\mathcal{O}(n+r)$. Moreover, if neither $N$ nor $N^{\prime}$ contains parallel edges, then neither does any intermediate network in the NNI-sequence.

Proof.

Assume that $N$ can be transformed into $N^{\prime}$ by pruning the edge $e=\{u,v\}$ at $u$ and regrafting it to $f=\{x,y\}$ . Note that there is then a (shortest) path $P=(u=v_{0},v_{1},v_{2},\ldots,v_{k}=x)$ from $u$ to $x$ in $N\setminus\{e\}$ , since otherwise $N^{\prime}$ would be disconnected. Without loss of generality, assume that $P$ does not contain $y$ . Furthermore, assume for now that $P$ does not contain $v$ . The idea is now to move $e$ along $P$ to $f$ with NNI0. In particular, we show how to construct a sequence $\sigma=(N=N_{0},N_{1},\ldots,N_{k}=N^{\prime})$ such that either $N_{i+1}$ can be obtained from $N_{i}$ by an NNI0 or $N_{i+1}=N_{i}$ , and such that $N_{i}$ contains the edge $e_{i}=\{v_{i},v\}$ . This process is illustrated in Figure 6. Assume we have constructed the sequence up to $N_{i}$ . Let $g=\{v_{i+1},w\}$ with $w\neq v$ be the edge incident to $v_{i+1}$ that is not on $P$ . Obtain $N_{i+1}$ from $N_{i}$ by swapping $e_{i}$ and $g$ with an NNI0 on the axis $\{v_{i},v_{i+1}\}$ . Note that this preserves the path $P$ and that $N_{i+1}$ may only contain a parallel edge if $N$ or $N^{\prime}$ contains parallel edges. As a result, we get $N_{k}=N^{\prime}$ .

It remains to show that every network in $\sigma$ is proper. Assume otherwise and let $N_{i+1}$ be the first improper network in $\sigma$ . Then $N_{i+1}$ contains a cut-edge $e_{c}$ that separates a blob $B$ from all leaves. We claim that $e_{c}$ is part of $P$ . Indeed, the pruning of the NNI0 from $N_{i}$ to $N_{i+1}$ has to create $B$ and the regrafting cannot be to $B$ , so it has to pass along $e_{c}$ (Figure 7). However, as $P$ is a path, the moving edge cannot pass $e_{c}$ again, so all networks $N_{j}$ for $j>i$ including $N^{\prime}$ are improper; a contradiction. Hence, all intermediate networks $N_{i}$ are proper and thus $\sigma$ is an NNI0-sequence from $N$ to $N^{\prime}$ .

Next, assume that $P$ contains $v_{i}=v$. Then first apply the process above to move $v$ of $\{u,v\}$ along $P^{\prime}=(v=v_{i},v_{i+1},\ldots,v_{k})$ to $v_{k}$. In the resulting network, apply the process above to move $u$ of $\{u,v\}=\{u,v_{k}\}$ along $P^{\prime\prime}=(u=v_{0},v_{1},\ldots,v_{i})$ to $v_{i}$. The process again avoids the creation of a network $N_{j}$ with parallel edges, if neither $N$ nor $N^{\prime}$ contains parallel edges. Furthermore, from Figure 7 we get that if $\sigma$ contained an improper network, then $u$ would be contained in the blob $B$. However, then $\{u,v\}$ and $e_{c}$ would be edges from $B$ to the rest of the network; again a contradiction.

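Lastly, note that the length of $P$ is in $\mathcal{O}(n+r)$ since $N$ contains only $2n+3r-3$ edges (a tree in $u\mathcal{T}_{n}$ has $2n-3$ edges, and each increase of the reticulation number by one adds three edges). Hence, the length of $\sigma$ is also in $\mathcal{O}(n+r)$. ∎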

Lemma 3.5.

Let $n\geq 3$. Let $N,N^{\prime}\in u\mathcal{N}_{n}$ such that there is a PR- that transforms $N$ into $N^{\prime}$. Let $e$ be the edge of $N$ removed by this PR-. Let $N$ have reticulation number $r$.

Then, there is an NNI0-sequence followed by one NNI- that transforms $N$ into $N^{\prime}$ by only moving and removing $e$ and whose length is in $\mathcal{O}(n+r)$. Moreover, if neither $N$ nor $N^{\prime}$ contains parallel edges, then neither do the intermediate networks in the NNI-sequence.

Proof.

Assume the PR- removes $e=\{u,v\}$ from $N$ to obtain $N^{\prime}$ . If $e$ is part of a triangle, the PR- move is an NNI- move. If $e$ is a parallel edge, then move either $u$ or $v$ with an NNI0 to obtain a network with a triangle that contains $e$ . Then the previous case applies. So assume otherwise, namely that $e$ is not part of a triangle or a pair of parallel edges. Then move $u$ with an NNI0-sequence closer to $v$ to form a triangle as follows.

Because removing $e$ in $N$ yields the proper network $N^{\prime}$ , it follows that $N\setminus\{e\}$ contains a shortest path $P$ from $u$ to $v$ . Since $e$ is not part of a triangle, this path must contain at least two nodes other than $u$ and $v$ . Let $\{x,y\}$ and $\{y,v\}$ be the last two edges on $P$ . Consider the PR0 that prunes $\{u,v\}$ at $u$ and regrafts it to $\{x,y\}$ . Note that this creates a triangle on the vertices $y$ , $u$ and $v$ . By Lemma 3.4 we can replace this PR0 with an NNI0-sequence. Lastly, we can remove $\{u,v\}$ with an NNI- to obtain $N^{\prime}$ . The bound on the length of the NNI-sequence as well as the second statement follow from Lemma 3.4. ∎

To conclude this section, we note that all previous results combined show that we can replace a TBR-sequence with a PR-sequence, which we can further replace with an NNI-sequence. For several connectedness results in Section 5 this allows us to focus on TBR and then derive results for NNI and PR.

4 Shortest paths

In this section, we focus on bounds on the distance between two specified networks. We restrict to the TBR-distance in $u\mathcal{N}_{n}$ and in $u\mathcal{N}_{n,r}$ , and study the structure of shortest sequences of moves. We make several observations about these sequences in general, and some about shortest sequences between two networks that have certain structure in common, e.g., common displayed networks. Hence, we get bounds on the TBR-distance between two networks, and we uncover properties of the spaces of phylogenetic networks which allow for reductions of the search space. For example, if $N$ and $N^{\prime}$ have reticulation number $r$ , no shortest path from $N$ to $N^{\prime}$ contains a network with a reticulation number less than $r$ . The proof of this statement relies on the following observation about the order in which TBR0 and TBR+ operations can occur in a shortest path.

Observation 4.1.

Let $N,N^{\prime}\in u\mathcal{N}_{n,r}$ such that there exists a TBR-sequence $\sigma_{0}=(N,M,N^{\prime})$ that uses a TBR+ and a TBR-. Then there is a TBR0 that transforms $N$ into $N^{\prime}$ .

Rephrasing Observation 4.1, a TBR+ followed by a TBR-, or vice versa, can be replaced by a TBR0. Hence, this case cannot occur in a shortest TBR-sequence. Next, we look at a TBR0 followed by a TBR+.

Lemma 4.2.

*Let $N,N^{\prime}\in u\mathcal{N}_{n}$ with reticulation numbers $r$ and $r+1$, respectively, such that there exists a shortest TBR-sequence $\sigma_{0}=(N,M,N^{\prime})$ that starts with a TBR0.

Then there is a TBR-sequence $\sigma_{+}=(N,M^{\prime},N^{\prime})$ that starts with a TBR+.*

Proof.

Note that the TBR0 from $N$ to $M$ of $\sigma_{0}$ can be replaced with a sequence consisting of a TBR+ followed by a TBR-. This TBR- and the TBR+ from $M$ to $N^{\prime}$ can now be combined to a TBR0, which gives us a sequence $\sigma_{+}$ . ∎

Let $N,N^{\prime}\in u\mathcal{N}_{n,r}$ and consider a shortest TBR-sequence from $N$ to $N^{\prime}$ that contains TBR+ and TBR- operations. If the converse of Lemma 4.2 also held, then we could reorder the sequence such that consecutive TBR+ and TBR- operations can be replaced with a TBR0. This would imply that $u\mathcal{N}_{n,r}$ is an isometric subgraph of $u\mathcal{N}_{n}$ under TBR. However, we now show that the converse of Lemma 4.2 does not hold in general, and, hence, adjacent operations of different types in a shortest TBR-sequence cannot always be swapped.

Lemma 4.3.

*Let $n\geq 4$ and $r\geq 2$. Let $N,N^{\prime}\in u\mathcal{N}_{n}$ with reticulation numbers $r$ and $r+1$, respectively, such that there exists a shortest TBR-sequence $\sigma_{+}=(N,M^{\prime},N^{\prime})$ that starts with a TBR+.

Then it is not guaranteed that there is a TBR-sequence $\sigma_{0}=(N,M,N^{\prime})$ that starts with a TBR0.*

Proof.

We claim that the networks $N$ and $N^{\prime}$ in Figure 8 are a pair of networks for which no TBR-sequence $\sigma_{0}=(N,M,N^{\prime})$ exists that starts with a TBR0. The two networks $M_{1}$ and $M_{2}$ in Figure 8 are the only two TBR- neighbours of $N^{\prime}$. However, it is easy to check that the TBR0-distance of $N$ and $M_{i}$, $i\in\{1,2\}$, is at least two. Hence, a shortest TBR-sequence from $N$ to $N^{\prime}$ that starts with a TBR0 has length at least three and so $\sigma_{0}$ cannot exist. Note that we can add an edge to each of the pairs of parallel edges to obtain an example without parallel edges. Moreover, the example can be extended to higher $n$ and $r$ by adding extra leaves between leaves 3 and 4, and by replacing a pair of parallel edges by a chain of parallel edges in each network. ∎

Note that the TBR0 used in Figure 8 to prove Lemma 4.3 is a PR0. Hence, the statement of Lemma 4.3 also holds for PR. On the positive side, if one of the two networks is a tree, then we can swap the TBR+ with the TBR0.

Lemma 4.4.

*Let $T\in u\mathcal{T}_{n}$ and $N\in u\mathcal{N}_{n}$ with reticulation number one such that there exists a shortest TBR-sequence $\sigma_{+}=(T,N^{\prime},N)$ that starts with a TBR+.

Then there is a TBR-sequence $\sigma_{0}=(T,T^{\prime},N)$ that starts with a TBR0.*

Proof.

We show how to obtain $\sigma_{0}$ from $\sigma_{+}$ . Suppose that $N^{\prime}$ is obtained from $T$ by adding the edge $f$ and that $N$ is obtained from $N^{\prime}$ by removing $e^{\prime}$ and adding $e$ . Note that $f$ is an edge of the cycle $C$ in $N^{\prime}$ . Furthermore, $e^{\prime}$ and $f$ are distinct. Indeed, otherwise there would be a shorter TBR-sequence from $T$ to $N$ that simply adds $e$ to $T$ .

Assume for now that $e^{\prime}$ is an edge of $C$ in $N^{\prime}$ . Then, $e^{\prime}$ can be removed with a TBR- from $N^{\prime}$ to obtain a tree $T^{\prime}$ . Hence, the TBR+ from $T$ to $N^{\prime}$ and the TBR- from $N^{\prime}$ to $T^{\prime}$ can be merged into a TBR0 from $T$ to $T^{\prime}$ . Furthermore, the edge $e$ can then be added to $T^{\prime}$ with a TBR+ to obtain $N$ . This yields the sequence $\sigma_{0}$ .

Next, assume that $e^{\prime}$ is not an edge of $C$ in $N^{\prime}$. Then, $e^{\prime}$ is a cut-edge in $N^{\prime}$ and $e$ is a cut-edge in $N$. Let $\bar{e}$ be the edge of $T$ that equals $e^{\prime}$, if it exists, or the edge that gets subdivided by $f$ into $e^{\prime}$ and another edge. Let $\bar{f}$ be the edge of $N$ defined as follows: it is equal to $f$ itself if $f$ is not touched by the TBR0 move from $N^{\prime}$ to $N$; it is the extension of $f$ if one of its endpoints is suppressed by this move; otherwise, it is one of the two edges obtained by subdividing $f$. Now let $T^{\prime}$ be a tree obtained by removing $\bar{f}$ from $N$. Then, there is a TBR0 from $T$ to $T^{\prime}$ that moves $\bar{e}$ to $\bar{e}^{\prime}$ and furthermore a TBR+ that adds $\bar{f}$ to $T^{\prime}$ and yields $N$. We obtain again $\sigma_{0}$. An example is given in Figure 9. ∎

Next, we look at shortest paths between a tree and a network. First, we show that if a network displays a tree, then there is a simple TBR--sequence from the network to the tree. Recall that $D(N)$ is the set of trees in $u\mathcal{T}_{n}$ displayed by $N\in u\mathcal{N}_{n}$. This result is the unrooted analogue of Lemma 7.4 by Bordewich et al. [BLS17] on rooted phylogenetic networks.

Lemma 4.5.

*Let $N\in u\mathcal{N}_{n,r}$ and $T\in u\mathcal{T}_{n}$ .

Then $T\in D(N)$ if and only if $\operatorname{d_{\textup{TBR}}}(T,N)=r$ , that is, iff there exists a TBR--sequence of length $r$ from $N$ to $T$ .*

Proof.

Note that $\operatorname{d_{\textup{TBR}}}(T,N)\geq r$ , since a TBR can reduce the reticulation number by at most one. Furthermore, if we apply a sequence of $r$ TBR- moves on $N$ , we arrive at a tree that is displayed by $N$ . Hence, if $T\not\in D(N)$ , then $\operatorname{d_{\textup{TBR}}}(T,N)>r$ .

We now use induction on $r$ to show that $\operatorname{d_{\textup{TBR}}}(T,N)\leq r$ if $T\in D(N)$ . If $r=0$ , then $T=N$ and the inequality holds. Now suppose that $r>0$ and that the statement holds whenever a network with a reticulation number less than $r$ displays $T$ . Fix an embedding of $T$ into $N$ and colour all edges of $N$ not covered by this embedding green. Note that removing a green edge with a TBR- might result in an improper network or a loop. Therefore, we have to show that there is always at least one edge that can be removed such that the resulting graph is a phylogenetic network. For this, consider the subgraph $H$ of $N$ induced by the green edges. If $H$ contains a component consisting of a single green edge $e$ , then removing $e$ from $N$ with a TBR- yields a network $N^{\prime}$ . If $H$ contains a tree component $S$ , then it is easy to see that removing an external edge of $S$ from $N$ with a TBR- yields a network $N^{\prime}$ . Otherwise, as $N$ is proper, a component $S$ displays a tree $T_{S}$ whose external edges cover exactly the external edges of $S$ . We can then apply the same case distinction to the edges of $S$ not covered by $T_{S}$ and either directly find an edge to remove or find further trees that cover the smaller remaining components. Since $S$ is finite, we eventually find an edge to remove. The induction hypothesis then applies to $N^{\prime}$ . This concludes the proof. ∎

Note that the proof of Lemma 4.5 also works if $T$ is a network displayed by $N$ . Hence, we get the following corollary.

Corollary 4.6.

*Let $N\in u\mathcal{N}_{n,r}$ and let $N^{\prime}\in u\mathcal{N}_{n,r^{\prime}}$ such that $N^{\prime}$ is displayed by $N$ .

Then $\operatorname{d_{\textup{TBR}}}(N^{\prime},N)=r-r^{\prime}$ , that is, there exists a TBR--sequence of length $r-r^{\prime}$ from $N$ to $N^{\prime}$ .*

Lemma 4.5 and Corollary 4.6 now allow us to construct TBR-sequences between networks that go down tiers and then come up again. In fact, for rooted networks this can sometimes be necessary as Klawitter and Linz have shown [KL19, Lemma 13]. However, we now show that this is never necessary for TBR on unrooted networks.

Lemma 4.7.

*Let $N,N^{\prime}\in u\mathcal{N}_{n}$ .

Then in no shortest TBR-sequence from $N$ to $N^{\prime}$ does a TBR- precede a TBR+.*

Proof.

Consider a minimal counterexample with $N,N^{\prime}\in u\mathcal{N}_{n}$ such that there exists a shortest TBR-sequence $\sigma$ from $N$ to $N^{\prime}$ that uses exactly one TBR- and TBR+ and that starts with this TBR-. If $\sigma$ uses TBR0 operations between the TBR- and the TBR+, then, by Lemma 4.2, we can swap the TBR+ forward until it directly follows the TBR-. However, then we can obtain a TBR-sequence shorter than $\sigma$ by combining the TBR- and the TBR+ into a TBR0 by 4.1; a contradiction. ∎

Combining Lemma 4.5, Corollary 4.6, and Lemma 4.2, we easily derive the following two corollaries about short sequences that do not go down tiers before going back up again.

Corollary 4.8.

Let $N,N^{\prime}\in u\mathcal{N}_{n}$ with reticulation number $r$ and $r^{\prime}$ , with $r\geq r^{\prime}$ . Then

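$$\operatorname{d_{\textup{TBR}}}(N,N^{\prime})\leq\min\{\operatorname{d_{\textup{TBR}}}(T,T^{\prime}) : T\in D(N),\ T^{\prime}\in D(N^{\prime})\}+r.$$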

Corollary 4.9.

Let $N,N^{\prime}\in u\mathcal{N}_{n}$ with reticulation number $r$ and $r^{\prime}$ , and $r\geq r^{\prime}$ . Let $T\in u\mathcal{T}_{n}$ such that $T\in D(N),D(N^{\prime})$ . Then

$$\operatorname{d_{\textup{TBR}}}(N,N^{\prime})\leq r.$$

Both Corollaries 4.8 and 4.9 can easily be proven by first finding a sequence that goes down to tier 0 and back up to tier $r$ , and then combining the $r^{\prime}$ TBR- with $r^{\prime}$ TBR+ into $r^{\prime}$ TBR0 using Lemma 4.2.
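The move count behind this argument can be written out explicitly. The following is only a sketch of the bookkeeping, read from $N^{\prime}$ towards $N$ (TBR+ and TBR- are inverse to each other and a TBR0 is reversible, so the direction does not affect the distance). The symbols $T\in D(N)$, $T^{\prime}\in D(N^{\prime})$ and $d=\operatorname{d_{\textup{TBR}}}(T,T^{\prime})$ are shorthand introduced here for the illustration.

```latex
\begin{align*}
\underbrace{r^{\prime}}_{\text{TBR}^{-}:\ N^{\prime}\to T^{\prime}}
  + \underbrace{d}_{\text{TBR}^{0}:\ T^{\prime}\to T}
  + \underbrace{r}_{\text{TBR}^{+}:\ T\to N}
  &= r^{\prime} + d + r
  && \text{(down to tier 0 and back up)} \\
\underbrace{r^{\prime}}_{\text{merged into TBR}^{0}}
  + \underbrace{d}_{\text{TBR}^{0}}
  + \underbrace{r - r^{\prime}}_{\text{remaining TBR}^{+}}
  &= d + r
  && \text{(each TBR}^{-}\text{/TBR}^{+}\text{ pair becomes one TBR}^{0}\text{)}
\end{align*}
```

When both networks display a common tree, as in Corollary 4.9, we may take $T=T^{\prime}$, so $d=0$ and the count reduces to $r$.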

The following lemma is the unrooted analogue of Proposition 7.7 by Bordewich et al. [BLS17]. We closely follow their proof.

Lemma 4.10.

*Let $N,N^{\prime}\in u\mathcal{N}_{n}$ such that $\operatorname{d_{\textup{TBR}}}(N,N^{\prime})=k$ . Let $T\in D(N)$ .

Then there exists a $T^{\prime}\in D(N^{\prime})$ such that*

$$\operatorname{d_{\textup{TBR}}}(T,T^{\prime})\leq k.$$

Proof.

The proof is by induction on $k$. If $k=0$, then the statement trivially holds. Suppose that $k=1$. If $T\in D(N^{\prime})$, then set $T^{\prime}=T$, and we have $\operatorname{d_{\textup{TBR}}}(T,T^{\prime})=0\leq 1$. So assume otherwise, namely that $T\not\in D(N^{\prime})$. Note that if $N^{\prime}$ has been obtained from $N$ by a TBR+, then $N^{\prime}$ displays $T$. Therefore, distinguish whether $N^{\prime}$ has been obtained from $N$ by a TBR0 or a TBR-.

Suppose that $N^{\prime}$ has been obtained from $N$ by a TBR0 that moves the edge $e=\{u,v\}$ of $N$. Fix an embedding $S$ of $T$ into $N$. Since $N^{\prime}$ does not display $T$, the edge $e$ is covered by $S$. Let $\bar{e}$ be the edge of $T$ that gets mapped to the path of $S$ that covers $e$. Let $S_{1}$ and $S_{2}$ be the two components of $S\setminus\{e\}$. Note that $S_{1},S_{2}$ have embeddings into $N$ and $N^{\prime}$. Now, if in $N$ there exists a path $P$ from the embedding of $S_{1}$ to the embedding of $S_{2}$ that avoids $e$, then the graph consisting of $P$, $S_{1}$, and $S_{2}$ is a tree $T^{\prime}$ displayed by $N^{\prime}$. Otherwise $e$ is a cut-edge of $N$ and the TBR0 moves $e$ to an edge $e^{\prime}$ connecting the two components of $N\setminus\{e\}$. Then in $N^{\prime}$ there is a path $P$ from the embedding of $S_{1}$ to the embedding of $S_{2}$ in $N^{\prime}$. Together they form an embedding of a tree $T^{\prime}$ displayed by $N^{\prime}$. In both cases $T^{\prime}$ can also be obtained from $T$ by moving $\bar{e}$ to where $P$ attaches to $S_{1}$ and $S_{2}$. If $N^{\prime}$ is obtained from $N$ by a TBR-, then the first case has to apply.

Now suppose that $k\geq 2$ and that the hypothesis holds for any two networks with TBR-distance at most $k-1$ . Let $N^{\prime\prime}\in u\mathcal{N}_{n}$ such that $\operatorname{d_{\textup{TBR}}}(N,N^{\prime\prime})=k-1$ and $\operatorname{d_{\textup{TBR}}}(N^{\prime\prime},N^{\prime})=1$ . Thus by induction there are trees $T^{\prime\prime}$ and $T^{\prime}$ such that $T^{\prime\prime}\in D(N^{\prime\prime})$ with $\operatorname{d_{\textup{TBR}}}(T,T^{\prime\prime})\leq k-1$ and $T^{\prime}\in D(N^{\prime})$ with $\operatorname{d_{\textup{TBR}}}(T^{\prime\prime},T^{\prime})\leq 1$ . It follows that $\operatorname{d_{\textup{TBR}}}(T,T^{\prime})\leq k$ , thereby completing the proof of the lemma. ∎

By setting one of the two networks in the previous lemma to be a phylogenetic tree and noting that the roles of $N$ and $N^{\prime}$ are interchangeable, the next two corollaries are immediate consequences of Lemmas 4.5 and 4.10.

Corollary 4.11.

Let $T\in u\mathcal{T}_{n}$ , $N\in u\mathcal{N}_{n,r}$ such that $\operatorname{d_{\textup{TBR}}}(T,N)=k$ . Then for every $T^{\prime}\in D(N)$

[TABLE]

Corollary 4.12.

Let $N\in u\mathcal{N}_{n,r}$ and let $T,T^{\prime}\in D(N)$ . Then

[TABLE]

The following theorem is the unrooted analogue of Theorem 7 by Klawitter and Linz [KL19]. Their proof carries over directly by replacing SNPR and rooted networks with TBR and unrooted networks, and by using Lemmas 4.5 and 4.10 and Theorem 6.1.

Theorem 4.13.

Let $T\in u\mathcal{T}_{n}$ and let $N\in u\mathcal{N}_{n,r}$ . Then

[TABLE]

5 Connectedness and diameters

Whereas in the previous section we studied the distance between two given networks, here we focus on global connectivity properties of several classes of phylogenetic networks under NNI, PR, and TBR. These results imply that these operations induce metrics on these spaces. For each connected metric space, we can ask about its diameter. Since a class of phylogenetic networks that contains networks with unbounded reticulation number naturally has an unbounded diameter, this question is mainly of interest for the tiers of a class. First, we recall some known results on unrooted phylogenetic trees.

Theorem 5.1 (Li et al. [LTZ96], Ding et al. [DGH11]).

The space $u\mathcal{T}_{n}$ is connected under

$\bullet$ NNI0 *with the diameter in $\Theta(n\log n)$,*

$\bullet$ PR0 *with the diameter in $n-\Theta(\sqrt{n})$, and*

$\bullet$ TBR0 *with the diameter in $n-\Theta(\sqrt{n})$.*

5.1 Network space

Huber et al. [HMW16, Theorem 5] proved that the space of phylogenetic networks that includes improper networks is connected under NNI. We reprove this for our definition of $u\mathcal{N}_{n}$ , but first look at the tiers of this space.

Theorem 5.2.

*Let $n\geq 0$ , $r\geq 0$ , and $m=n+r$ .

Then $u\mathcal{N}_{n,r}$ is connected under NNI with the diameter in $\Theta(m\log m)$ .*

Proof.

Let $N\in u\mathcal{N}_{n,r}$ and let $T\in u\mathcal{T}_{n}$ be a tree displayed by $N$ . We show that $N$ can be transformed into a sorted $r$ -handcuffed caterpillar $N^{*}$ with $\mathcal{O}(m\log m)$ NNI. Our process is as follows and illustrated in Figure 10.

Step 1.

Transform $N$ into a network $N_{T}$ that is tree-based on $T$ .

Step 2.

Transform $N_{T}$ into a handcuffed tree $N_{H}$ on the leaves 1 and 2.

Step 3.

Transform $N_{H}$ into a sorted handcuffed caterpillar $N^{*}$ .

We now describe this process in detail. For Step 1, we show how to construct an NNI0-sequence $\sigma$ from $N$ to $N_{T}$ , and we give a bound on the length of $\sigma$ . Let $S$ be an embedding of $T$ into $N$ , that is, $S$ is a subdivision of $T$ and a subgraph of $N$ . Colour all edges of $N$ used by $S$ black and all other edges green. Note that this yields green, connected subgraphs $G_{1},\ldots,G_{l}$ of $N$ ; more precisely, the $G_{i}$ are the connected components of the graph induced by the green edges of $N$ . Note that each $G_{i}$ has at least two vertices in $S$ , since otherwise $N$ would not be proper. Furthermore, if each $G_{i}$ consists of a single edge, then $N$ is tree-based on $T$ . Assuming otherwise, we show how to break the $G_{i}$ apart.

First, if there is a triangle on vertices $v_{1},u,v_{2}$ where $v_{1}$ and $v_{2}$ are adjacent vertices in $S$ and $u$ is their neighbour in $G_{i}$, then change the embedding of $S$ (and $T$) so that it takes the path $v_{1},u,v_{2}$ instead of $v_{1},v_{2}$ (see Figure 11a). Otherwise, there is an edge $\{v,u\}$ where $v$ is in $S$ and the other vertices adjacent to $u$ are not adjacent to $v$. Let $\{u,w_{1}\}$ and $\{u,w_{2}\}$ be the other edges incident to $u$. Apply an NNI0 to move $\{u,w_{1}\}$ to $S$ as in Figure 11b. Note that each such NNI0 decreases the number of vertices in green subgraphs and increases the number of vertices in $S$. Furthermore, the resulting network is clearly proper. Therefore, repeat these cases until all $G_{i}$ consist of single edges. Let the resulting graph be $N_{T}$. Since there are at most $2(r-1)$ vertices in all green subgraphs that are not in $S$, the number of required NNI0 for Step 1 is at most

[TABLE]

In Step 2 we transform $N_{T}$ into a handcuffed tree $N_{H}$ on the leaves 1 and 2. Let $M=\{\{u_{1},v_{1}\},\{u_{2},v_{2}\},\ldots,\{u_{r},v_{r}\}\}$ be the set of green edges in $N_{T}$ , that is, the edges that are not in the embedding $S$ of $T$ into $N_{T}$ . Without loss of generality, assume that for $i\in\{1,\ldots,r\}$ the distance between $u_{i}$ and leaf $1$ in $S$ is at most the distance of $v_{i}$ to leaf $1$ in $S$ . The idea is to sweep along the edges of $S$ to move the $u_{i}$ towards leaf $1$ and then do the same for the $v_{i}$ towards leaf $2$ .

For an edge $e$ of $T$ , let $P_{e}$ be the path of $S$ corresponding to $e$ . Let $e_{1}$ be the edge of $T$ incident to leaf $1$ . Impose directions on the edges of $T$ towards leaf $1$ . Do the same for the edges of $S$ accordingly. This gives a partial order $\preceq$ on the edges of $T$ with $e_{1}$ as maximum. Let $\prec$ be a linear extension of $\preceq$ on the edges of $T$ .

Let $e=(x,y)$ be the minimum of $\prec$ . Let $P_{e}=(x,\ldots,y)$ be the corresponding path in $S$ . From $x$ to $y$ along $P_{e}$ , proceed as follows.

(i) If there is an edge $(u_{i},v_{l})$ in $P_{e}$, then swap $u_{i}$ and $v_{l}$ with an NNI0.

(ii) If there is an edge $(u_{i},u_{j})$ in $P_{e}$, then move the $u_{j}$ endpoint of the green edge incident to $u_{j}$ onto the green edge incident to $u_{i}$ with an NNI0.

(iii) Otherwise, if there is an edge $(u_{i},y)$ in $P_{e}$, then move $u_{i}$ beyond $y$.

This is illustrated in Figure 12. Informally speaking, we stack $u_{j}$ onto $u_{i}$ so they can move together towards $e_{1}$ . Repeat this process for each edge in the order given by $\prec$ . For the last edge $e_{1}$ , ignore case (iii). Next “unpack” the stacked $u_{i}$ ’s on $e_{1}$ .

We now count the number of NNI0 needed. Firstly, each $v_{l}$ is swapped at most once with a $u_{i}$. Secondly, each $u_{j}$ is moved onto and off a green edge at most once. Furthermore, each vertex of $S$ corresponding to a vertex of $T$ is swapped at most twice. Hence, the total number of NNI0 required is at most

[TABLE]

Repeat this process for the $v_{i}$ towards leaf $2$ . Since the $v_{i}$ do not have to be swapped with $u_{j}$ , the total number of NNI0 required for this is at most

[TABLE]

Note that the resulting network may not yet be a handcuffed tree as the order of the $u_{i}$ and $v_{j}$ may be different. Hence, lastly in Step 2, to obtain $N_{H}$ sort the edges with the mergesort-like algorithm by Li et al. [LTZ96, Lemma 2]. They show that the required number of NNI0 for this is at most

[TABLE]

For Step 3, consider the path $P$ in $S$ from leaf $1$ to $2$ . If $P$ contains only one pendant subtree, then $N_{H}$ is handcuffed on the cherry $\{1,2\}$ . Otherwise, use NNI0 to reduce it to one pendant subtree. This takes at most $n$ NNI0. Next, transform the pendant subtree of $P$ into a caterpillar to obtain a handcuffed caterpillar, again with at most $n$ NNI0. Lastly, sort the leaves with the algorithm from Li et al. [LTZ96, Lemma 2] to obtain the sorted handcuffed caterpillar $N^{*}$ . The required number of NNI0 to get from $N_{H}$ to $N^{*}$ is at most

[TABLE]

Since we can transform any network $N\in u\mathcal{N}_{n,r}$ into $N^{*}$ , it follows that $u\mathcal{N}_{n,r}$ is connected under NNI. Furthermore, adding Equations 1 to 5 up and multiplying the result by two shows that the diameter of $u\mathcal{N}_{n,r}$ under NNI0 is at most

[TABLE]

Francis et al. [FHMW18, Theorem 2] gave the lower bound $\Omega(m\log m)$ on the diameter of tier $r$ of the space that allows improper networks under NNI ${}^{0}_{\text{improper}}$ (NNI0 without the properness condition). Their proof consists of two parts: a lower bound on the total number of networks in a tier $\lvert u\mathcal{N}_{n,r}\rvert$, and upper bounds on the number of networks that can be reached from one network for each fixed number of NNI ${}^{0}_{\text{improper}}$. The diameter of $u\mathcal{N}_{n,r}$ is at least the smallest number of moves for which the previously mentioned upper bound is greater than the lower bound on $\lvert u\mathcal{N}_{n,r}\rvert$.

Our version of NNI0 is stricter than theirs as we do not allow improper networks. Hence, the number of networks that can be reached with a fixed number of NNI0 is at most the number of networks that can be reached with the same number of NNI ${}^{0}_{\text{improper}}$ . Furthermore, their lower bound on $\lvert u\mathcal{N}_{n,r}\rvert$ is found by counting the number of Echidna networks, a class of networks only containing proper networks. Combining these two observations, we see that their lower bound for the diameter of $u\mathcal{N}_{n,r}$ under NNI ${}^{0}_{\text{improper}}$ is also a lower bound for $u\mathcal{N}_{n,r}$ under NNI0. ∎

From Theorem 5.2 we get the following corollary.

Corollary 5.3.

The space $u\mathcal{N}_{n}$ is connected under NNI with unbounded diameter.

Since, by 3.1, every NNI is also a PR and TBR, the statements in Theorem 5.2 and Corollary 5.3 also hold for PR and TBR. This observation has been made before by Francis et al. [FHMW18] for tiers of the space of networks that allow improper networks.

Corollary 5.4.

The spaces $u\mathcal{N}_{n}$ and $u\mathcal{N}_{n,r}$ are connected under PR and TBR.

We now look at the diameters of $u\mathcal{N}_{n,r}$ under PR and TBR.

Theorem 5.5.

*Let $n\geq 0$ , $r\geq 0$ .

Then the diameter of $u\mathcal{N}_{n,r}$ under PR0 is in $\Theta(n+r)$ with the upper bound $n+2r$ .*

Proof.

The asymptotic lower bound was proven by Francis et al. [FHMW18, Proposition 4]. Concerning an upper bound, Janssen et al. [JJE+18, Theorem 4.22] showed that the distance of two improper networks $M$ and $M^{\prime}$ under PR is at most $n+\frac{8}{3}r$, of which $\frac{2}{3}r$ PR0 moves are used to transform $M$ and $M^{\prime}$ into proper networks $N$ and $N^{\prime}$. Hence, the PR-distance of $N$ and $N^{\prime}$ is at most $n+2r$. ∎

Theorem 5.6.

*Let $n\geq 0$ , $r\geq 0$ .

Then the diameter of $u\mathcal{N}_{n,r}$ under TBR is in $\Theta(n+r)$ with the upper bound*

$$n-3-\left\lfloor\frac{\sqrt{n-2}-1}{2}\right\rfloor+r.$$

Proof.

Like for PR, the lower bound was proven by Francis et al. [FHMW18, Proposition 4]. In Corollary 4.8 we show that the TBR-distance of two networks $N$ and $N^{\prime}\in u\mathcal{N}_{n,r}$ that display a tree $T$ and $T^{\prime}\in u\mathcal{T}_{n}$ , respectively, is at most $\operatorname{d_{\textup{TBR}}}(T,T^{\prime})+r$ . Since $\operatorname{d_{\textup{TBR}}}(T,T^{\prime})\leq n-3-\lfloor\frac{\sqrt{n-2}-1}{2}\rfloor$ by Theorem 1.1 of Ding et al. [DGH11] it follows that $\operatorname{d_{\textup{TBR}}}(N,N^{\prime})\leq n-3-\lfloor\frac{\sqrt{n-2}-1}{2}\rfloor+r$ . ∎
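To get a feel for the size of this upper bound, it can be evaluated with exact integer arithmetic. The sketch below is an illustration only: the function names are hypothetical, and it assumes $n\geq 3$ so that the square root is taken of a positive integer. It uses the identity $\lfloor(\sqrt{m}-1)/2\rfloor=\lfloor(\lfloor\sqrt{m}\rfloor-1)/2\rfloor$, which avoids floating-point rounding.

```python
import math


def tbr_tree_bound(n: int) -> int:
    """Bound n - 3 - floor((sqrt(n - 2) - 1) / 2) on the TBR-distance of
    two trees on n leaves, as quoted from Ding et al. in the proof."""
    if n < 3:
        raise ValueError("this sketch assumes n >= 3")
    # math.isqrt returns floor(sqrt(.)) exactly for non-negative integers.
    return n - 3 - (math.isqrt(n - 2) - 1) // 2


def tbr_tier_bound(n: int, r: int) -> int:
    """Upper bound on the TBR-diameter of tier r: the tree bound plus r."""
    if r < 0:
        raise ValueError("r must be non-negative")
    return tbr_tree_bound(n) + r
```

For instance, `tbr_tier_bound(11, 0)` evaluates to 7 and `tbr_tier_bound(27, 3)` to 25.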

5.2 Networks displaying networks

Bordewich [Bor03, Proposition 2.9] and Mark et al. [MMS16] showed that the space of rooted phylogenetic trees that display a set of triplets (trees on three leaves) is connected under NNI. Furthermore, Bordewich et al. [BLS17] showed that the space of rooted phylogenetic networks that display a set of rooted phylogenetic trees is connected. We give a general result for unrooted phylogenetic networks that display a set of networks. For this, we will use Lemma 4.5, which, as we recall, guarantees that if a network $N\in u\mathcal{N}_{n,r}$ displays a tree $T\in u\mathcal{T}_{n}$ , then there is a sequence of $r$ TBR- from $N$ to $T$ .

Proposition 5.7.

*Let $P=\{P_{1},...,P_{k}\}$ be a set of $k$ phylogenetic networks $P_{i}$ on $Y_{i}\subseteq X=\{1,\ldots,n\}$ .

Then $u\mathcal{N}_{n}(P)$ is connected under NNI, PR, and TBR.*

Proof.

Define the network $N_{P}\in u\mathcal{N}_{n}(P)$ as follows. Let $P_{0}\in u\mathcal{T}_{n}$ be the caterpillar where the leaves are ordered from $1$ to $n$ ; that is, $P_{0}$ contains a path $(v_{2},v_{3},\ldots,v_{n-1})$ such that leaf $i$ is incident to $v_{i}$ , leaf $1$ is incident to $v_{2}$ , and leaf $n$ is incident to $v_{n-1}$ . Let $e_{i}$ be the edge incident to leaf $i$ in $P_{0}$ . Subdivide $e_{i}$ with $k$ vertices $u_{i}^{1},\ldots,u_{i}^{k}$ . Now, for $P_{j}\in P$ , $j\in\{1,\ldots,k\}$ , identify leaf $i$ of $P_{j}$ with $u_{i}^{j}$ of $P_{0}$ and remove its label $i$ . Finally, in the resulting network suppress any degree two vertex. This is necessary if one or more of the $P_{j}$ have fewer than $n$ leaves. The resulting network $N_{P}$ now displays all networks in $P$ . An example is given in Figure 13.

Let $N\in u\mathcal{N}_{n}(P)$. Construct a TBR-sequence from $N$ to $N_{P}$ by, roughly speaking, building a copy of $N_{P}$ attached to $N$, and then removing the original parts of $N$. First, add $P_{0}$ to $N$ by adding an edge $e=\{v_{1},v_{2}\}$ from the edge incident to leaf 1 to the edge incident to leaf 2 with a TBR+. Then add another edge from $e$ to the edge incident to leaf 3, and so on up to leaf $n$. Colour all newly added edges and the edges incident to the leaves blue, and all other edges red. Note that the blue edges now give an embedding of $P_{0}$ into the current network. Now, ignoring all red edges, it is straightforward to add the $P_{j}$, $j\in\{1,\ldots,k\}$, one after the other with TBR+ such that the resulting network displays $N_{P}$. For example, one could start by adding a tree displayed by $P_{j}$ and then adding any other edges. The first part works similarly to the construction of $P_{0}$ and the second part is possible by Lemma 4.5. Lastly, remove all red edges with TBR- such that every intermediate network is proper. This is again possible by Lemma 4.5 and yields the network $N_{P}$. Note that in the first two stages the red edges (plus external edges) display $P$ and in the last phase the non-red edges display $P$.

Since we only used TBR+ and TBR- operations, the statement also holds for PR. For NNI, by Lemma 3.5 we can replace each of these operations that add or remove an edge $e$ by NNI-sequences that only move and remove or add the edge $e$ . Hence, the statement also holds for NNI. ∎

For the following corollary, note that a quartet is an unrooted binary tree on four leaves and a quarnet is an unrooted binary, level-1 network on four leaves [HMSW18].

Corollary 5.8.

Let $X=\{1,...,n\}$ . Let $P$ be a set of phylogenetic trees on $X$ , a set of quartets on $X$ , or a set of quarnets on $X$ . Then $u\mathcal{N}_{n}(P)$ is connected under NNI, PR, and TBR.

5.3 Tree-based networks

Being tree-based is a related but more restrictive concept than displaying a tree. So, next, we consider the class of tree-based networks. We start with the tiers of $u\mathcal{TB}_{n}(T)$, which is the set of tree-based networks that have the tree $T$ as base tree.

Theorem 5.9.

Let $T\in u\mathcal{T}_{n}$ . Then the space $u\mathcal{TB}_{n,r}(T)$ is connected under

$\bullet$ TBR *with the diameter being between $\lceil\frac{r}{3}\rceil$ and $r$,*

$\bullet$ PR *with the diameter being between $\lceil\frac{r}{2}\rceil$ and $2r$, and*

$\bullet$ NNI *with the diameter being in $\mathcal{O}(r(n+r))$.*

Proof.

We start with the proof for TBR. Let $N,N^{\prime}\in u\mathcal{TB}_{n,r}(T)$. Consider embeddings of $T$ into $N$ and $N^{\prime}$. Let $S=\{e_{1},\ldots,e_{r}\}$ and $S^{\prime}=\{e_{1}^{\prime},\ldots,e_{r}^{\prime}\}$ be the sets of all edges not covered by these embeddings of $T$ in $N$ and in $N^{\prime}$, respectively. Since $N$ and $N^{\prime}$ are tree-based, $S$ and $S^{\prime}$ consist of vertex-disjoint edges. Following the embeddings of $T$ into $N$ and $N^{\prime}$, it is straightforward to move each edge $e_{i}$ with a TBR0 from $N$ to where $e_{i}^{\prime}$ is in $N^{\prime}$. In total, this requires at most $r$ TBR0. Since every intermediate network is clearly in $u\mathcal{TB}_{n,r}(T)$, this gives connectedness of $u\mathcal{TB}_{n,r}(T)$ and an upper bound of $r$ on the diameter. For the lower bound, consider a network $M$ with $r$ pairs of parallel edges and a network $M^{\prime}$ without any. Observe that a TBR0 can break at most three pairs of parallel edges, and that only if a pair of parallel edges is removed and attached to two other pairs of parallel edges. Hence, for these particular $M$ and $M^{\prime}$ we have that $\operatorname{d_{\textup{TBR}}}(M,M^{\prime})\geq\lceil\frac{r}{3}\rceil$.

The constructed TBR0-sequence for $N$ to $N^{\prime}$ above can be converted straightforwardly into a PR0-sequence from $N$ to $N^{\prime}$ of length at most $2r$ . For the lower bound, let $M$ and $M^{\prime}$ be as above and note that a PR can break at most two pairs of parallel edges. Hence, $\operatorname{d_{\textup{PR}}}(M,M^{\prime})\geq\lceil\frac{r}{2}\rceil$ .

By Lemma 3.4, the PR-sequence can be used to construct an NNI-sequence from $N$ to $N^{\prime}$ that only moves the edges $e_{i}$ along paths of the embedding of $T$. Since the PR-sequence has length at most $2r$ and each PR can be replaced by an NNI-sequence of length in $\mathcal{O}(n+r)$, this gives the upper bound of $\mathcal{O}(r(n+r))$ on the diameter of $u\mathcal{TB}_{n,r}(T)$ under NNI. ∎
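The explicit TBR and PR bounds of Theorem 5.9 are easy to tabulate for a given tier. The helper below is a small illustration with a hypothetical name; it returns the (lower, upper) diameter bounds from the theorem and leaves out NNI, for which only an asymptotic bound is stated.

```python
def tree_based_tier_bounds(r: int) -> dict[str, tuple[int, int]]:
    """(lower, upper) bounds on the diameter of the tier-r space of
    tree-based networks with a fixed base tree, as in Theorem 5.9."""
    if r < 0:
        raise ValueError("r must be non-negative")

    def ceil_div(a: int, b: int) -> int:
        # Integer ceiling division without floating point.
        return -(-a // b)

    return {
        "TBR": (ceil_div(r, 3), r),      # one TBR0 breaks at most three pairs
        "PR": (ceil_div(r, 2), 2 * r),   # one PR breaks at most two pairs
    }
```

For example, `tree_based_tier_bounds(7)` gives the TBR range (3, 7) and the PR range (4, 14).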

We use Theorem 5.9 to prove connectedness of other spaces of tree-based networks.

Theorem 5.10.

*Let $T\in u\mathcal{T}_{n}$ .

Then the spaces $u\mathcal{TB}_{n}(T)$ , $u\mathcal{TB}_{n,r}$ , and $u\mathcal{TB}_{n}$ are each connected under TBR, PR, and NNI. Moreover, the diameter of $u\mathcal{TB}_{n,r}$ is in $\Theta(n+r)$ under TBR and PR and in $\mathcal{O}(n\log n+r(n+r))$ under NNI.*

Proof.

Assume without loss of generality that $T$ has the cherry $\{1,2\}$ . First, let $N$ and $N^{\prime}$ be in tiers $r$ and $r^{\prime}$ of $u\mathcal{TB}_{n}(T)$ , respectively, such that they are $r$ - and $r^{\prime}$ -handcuffed on the cherry $\{1,2\}$ . Then $\operatorname{d_{\textup{NNI}}}(N,N^{\prime})=\lvert r^{\prime}-r\rvert$ , as we can decrease the number of handcuffs with NNI-. Since, by Theorem 5.9, the tiers of $u\mathcal{TB}_{n,r}(T)$ are connected, the connectedness of $u\mathcal{TB}_{n}(T)$ follows.

Second, let $N,N^{\prime}\in u\mathcal{TB}_{n,r}$ be tree-based networks on $T$ and $T^{\prime}$ respectively, and with an $r$ -burl on the edge incident to leaf $1$ . Ignoring the burls, by Theorem 5.1, $N$ can be transformed into $N^{\prime}$ by transforming $T$ into $T^{\prime}$ with $\mathcal{O}(n\log n)$ NNI0 or with $\mathcal{O}(n)$ PR0 or TBR0. With Theorem 5.9, the connectedness of $u\mathcal{TB}_{n,r}$ and the upper bounds on the diameter follow. The lower bound on the diameter under PR and TBR also follows from Theorem 5.1 and Theorem 5.9.

Lastly, the connectedness of $u\mathcal{TB}_{n}$ follows similarly from the connectedness of $u\mathcal{T}_{n}$ and $u\mathcal{TB}_{n,r}$ . ∎

5.4 Level- $k$ networks

To conclude this section, we prove the connectedness of the space of level- $k$ networks.

Theorem 5.11.

*Let $n\geq 2$ and $k\geq 1$ .

Then, the space $u\mathcal{LV}\text{-}k_{n}$ is connected under TBR and PR with unbounded diameter.*

Proof.

Let $N\in u\mathcal{LV}\text{-}k_{n}$ and $T\in u\mathcal{T}_{n}$ . We show that $N$ can be transformed into the network $M\in u\mathcal{LV}\text{-}k_{n}$ that can be obtained from $T$ by adding a $k$ -burl to the edge incident to leaf $1$ . First, create a $k$ -burl in $N$ on the edge incident to leaf $1$ . This can be done using $k$ PR+. Next, using Lemma 4.5 remove all other blobs. This gives a network $M^{\prime}$ which consists of a tree $T^{\prime}$ with a $k$ -burl at leaf $1$ . There is a PR0-sequence from $T^{\prime}$ to $T$ , which is easily converted into a sequence from $M^{\prime}$ to $M$ . This proves the connectedness of $u\mathcal{LV}\text{-}k_{n}$ under PR and also TBR. Lastly, note that the diameter is unbounded because the number of possible reticulations in a level- $k$ network is unbounded. ∎

Note that an NNI+ cannot directly create a pair of parallel edges. We may instead add a triangle with an NNI+ and then use an NNI0 to transform it into a pair of parallel edges. However, if the triangle is added within a level- $k$ blob of a level- $k$ network, then this would increase the level. Therefore, to prove connectedness of level- $k$ networks under NNI we use the same idea as for PR but are more careful to not increase the level.

Theorem 5.12.

*Let $n\geq 3$ and $k\geq 1$ .

Then, the space $u\mathcal{LV}\text{-}k_{n}$ is connected under NNI with unbounded diameter.*

Proof.

Let $N\in u\mathcal{LV}\text{-}k_{n}$ and let $T\in u\mathcal{T}_{n}$ . Like in the proof of Theorem 5.11, we want to transform $N$ into a network $M$ obtained from $T$ by adding a $k$ -burl to the edge incident to leaf $1$ .

Let $B$ be a level- $k$ blob of $N$ . Assume that $N$ contains another blob $B^{\prime}$ . By Lemma 4.5 there is a sequence of PR- that can remove $B^{\prime}$ . Use Lemma 3.5 to substitute this sequence with an NNI-sequence that reduces $B^{\prime}$ to a level-1 blob. Note that this can be done locally within blob $B^{\prime}$ and its incident edges. Therefore, this process does not increase the level of a network along this sequence. If $B^{\prime}$ is now a cycle of size at least three, then we can shrink it to a triangle, if necessary, and remove it with an NNI-. If $B^{\prime}$ is a pair of parallel edges and one of its vertices is adjacent to a degree three vertex $v$ that is not part of a level- $k$ blob, then use an NNI0 to increase the size of $B^{\prime}$ into a triangle by including $v$ or merge it with the blob containing $v$ . Next, either remove the resulting triangle, or repeat the process above to remove the new blob. Otherwise, ignore $B^{\prime}$ for now and continue with another blob of the current network that is neither $B^{\prime}$ nor $B$ . When this process terminates, we arrive at a network that has only blob $B$ , and, potentially, pairs of parallel edges that are incident to both $B$ and a leaf. That is the case since a pair of parallel edges incident to a degree three vertex not in $B$ could be removed with an NNI0 and an NNI-.

If the edge incident to leaf $1$ contains a pair of parallel edges or is incident to a degree three vertex not in $B$ , then use $k-1$ NNI+ and NNI0 (or $k$ in the latter case) to create a $k$ -burl next to leaf $1$ . Otherwise, if $B$ is incident to three or more cut-edges, then one of them is not incident to leaf $1$ and can be moved to the edge incident to leaf $1$ with an NNI0-sequence. If $B$ is incident to two or fewer cut-edges, there is a vertex incident to three cut edges (since $n\geq 3$ ) and one of them can be moved to the edge incident to leaf $1$ with an NNI0-sequence. Then apply the first case again to create a $k$ -burl. Finally, remove $B$ and any remaining pair of parallel edges. This gives a network $M^{\prime}$ which consists of a tree $T^{\prime}$ with a $k$ -burl at leaf $1$ . There is an NNI0-sequence from $T^{\prime}$ to $T$ , which is easily converted into a sequence from $M^{\prime}$ to $M$ . Lastly, note that the diameter is unbounded because for each $r\geq 0$ , there is a level- $k$ network with $r$ reticulations. ∎

6 Isometric relations between spaces

Recall that a space $\mathcal{C}_{n}$ is an isometric subgraph of $u\mathcal{N}_{n}$ under a rearrangement operation, say TBR, if the TBR-distance of two networks in $\mathcal{C}_{n}$ is the same as their TBR-distance in $u\mathcal{N}_{n}$ . In this section, we investigate this question for $u\mathcal{T}_{n}$ under TBR, and for tree-based networks and level- $k$ networks under TBR and PR.

We start with $u\mathcal{T}_{n}$ . The proof of the following theorem follows the proof by Bordewich et al. [BLS17, Proposition 7.1] for their equivalent statement for SNPR on rooted phylogenetic trees and networks closely.

Theorem 6.1.

The space $u\mathcal{T}_{n}$ is an isometric subgraph of $u\mathcal{N}_{n}$ under TBR. Moreover, every shortest TBR-sequence from $T\in u\mathcal{T}_{n}$ to $T^{\prime}\in u\mathcal{T}_{n}$ only uses TBR0.

Proof.

Let $\operatorname{d}_{\mathcal{T}}$ and $\operatorname{d}_{\mathcal{N}}$ be the TBR-distance in $u\mathcal{T}_{n}$ and $u\mathcal{N}_{n}$ respectively. To prove the statement, it suffices to show that $\operatorname{d}_{\mathcal{T}}(T,T^{\prime})=\operatorname{d}_{\mathcal{N}}(T,T^{\prime})$ for every pair $T,T^{\prime}\in u\mathcal{T}_{n}$ . Note that $\operatorname{d}_{\mathcal{T}}(T,T^{\prime})\geq\operatorname{d}_{\mathcal{N}}(T,T^{\prime})$ holds by definition. To prove the converse, let $\sigma=(T=N_{0},N_{1},\ldots,N_{k}=T^{\prime})$ be a shortest TBR-sequence from $T$ to $T^{\prime}$ . Consider the following colouring of the edges of each $N_{i}$ , for $i\in\{0,\ldots,k\}$ . Colour all edges of $T=N_{0}$ blue. For $i\in\{1,\ldots,k\}$ preserve the colouring of $N_{i-1}$ to a colouring of $N_{i}$ for all edges except those affected by the TBR. In particular, an edge that gets added or moved is coloured red, an edge resulting from a vertex suppression is coloured blue if the two merged edges were blue and red otherwise, and the edges resulting from an edge subdivision are coloured like the subdivided edge.

Let $F_{i}$ be the graph obtained from $N_{i}$ by removing all red edges. We claim that $F_{i}$ is a forest with at most $k+1$ components. Since $F_{0}=T$ , the statement holds for $i=0$ . If $N_{i}$ is obtained from $N_{i-1}$ by a TBR+, then $F_{i}=F_{i-1}$ . If $N_{i}$ is obtained from $N_{i-1}$ by a TBR0 or TBR-, then at most one component gets split. Note that $F_{k}$ is a so-called agreement forest for $T$ and $T^{\prime}$ and thus $\operatorname{d}_{\mathcal{T}}(T,T^{\prime})\leq k=\operatorname{d}_{\mathcal{N}}(T,T^{\prime})$ by Theorem 2.13 of Allen and Steel [AS01]. Furthermore, if $\sigma$ used a TBR+, then the forest $F_{k}$ would contain at most $k$ components. However, then $\operatorname{d}_{\mathcal{T}}(T,T^{\prime})<k$ ; a contradiction. ∎
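The forest argument in this proof can be checked mechanically on small examples. The sketch below assumes a network is given as a vertex list and a list of `(u, v, colour)` triples (a representation chosen here for illustration, not taken from the paper); it deletes the red edges and uses a union-find structure to count the components of what remains, returning `None` if the blue edges contain a cycle.

```python
def blue_forest_components(vertices, coloured_edges):
    """Count components of the graph left after deleting all red edges.

    Returns the number of components if the blue edges form a forest,
    and None if they contain a cycle (including a pair of parallel edges).
    """
    parent = {v: v for v in vertices}

    def find(v):
        # Find the representative of v, with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    components = len(parent)
    for u, v, colour in coloured_edges:
        if colour != "blue":
            continue  # red edges are removed to obtain the forest
        root_u, root_v = find(u), find(v)
        if root_u == root_v:
            return None  # this blue edge closes a cycle
        parent[root_u] = root_v
        components -= 1
    return components
```

On the quartet tree with leaves 1, 2, 3, 4 and internal vertices `a` and `b`, colouring the internal edge red leaves two components, matching the single split caused by one TBR0.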

Francis et al. [FHMW18] gave the example in Figure 14 to show that the tiers $u\mathcal{N}_{n,r}$ for $n\geq 5$ and $r>0$ are not isometric subgraphs of $u\mathcal{N}_{n}$ under NNI. Their question of whether tier zero, $u\mathcal{T}_{n}$ , is an isometric subgraph of $u\mathcal{N}_{n}$ under NNI remains open.

Lemma 6.2.

Let $n\geq 5$ and $r\geq 1$ . Then the space $u\mathcal{N}_{n,r}$ is not an isometric subgraph of $u\mathcal{N}_{n}$ under NNI.

Lemma 6.3.

For $n=4$ and $r=13$ the space $u\mathcal{N}_{n,r}$ is not an isometric subgraph of $u\mathcal{N}_{n}$ under PR.

Proof.

For the networks $N$ and $N^{\prime}$ in $u\mathcal{N}_{n,r}$ shown in Figure 15 there is a length three PR-sequence that traverses tier $r+1$ , for example, like the depicted sequence $\sigma=(N=N_{0},N_{1},N_{2},$ $N_{3}=N^{\prime})$ . To prove the statement we show that every PR0-sequence from $N$ to $N^{\prime}$ has length at least four.

The networks $N$ and $N^{\prime}$ contain the highlighted (sub)blobs $B_{1}$ , $B_{2}$ , (resp. $B_{1}^{\prime}$ and $B_{2}^{\prime}$ ), $B_{3}$ , and $B_{4}$ . Observe that the edges between $B_{1}$ and $B_{2}$ and between $B_{3}$ and $B_{4}$ may only be pruned from a blob by a PR0 if they get regrafted to the same blob again. Otherwise the resulting network is improper. Note that to derive $B_{1}^{\prime}$ from $B_{1}$ an edge has to be regrafted to the “top” of $B_{1}$ and the edge to $B_{2}$ has to be pruned. By the first observation, combining these into one PR0 cannot build the connection to $B_{3}$ . The same applies for the transformation of $B_{2}$ into $B_{2}^{\prime}$ and its connection to $B_{4}$ . Therefore, we either need four PR0 to derive $B_{1}^{\prime}$ and $B_{2}^{\prime}$ or two PR0 plus two PR0 to build the connections to $B_{3}$ and $B_{4}$ . In conclusion, at least four PR0 are required to transform $N$ into $N^{\prime}$ , which concludes this proof. ∎

By replacing a leaf with a tree, and adding more pairs of parallel edges to the edge leading to leaf $4$ , this example can be made to work for $n\geq 4$ and $r\geq 13$ .

Theorem 6.4.

For $n\geq 6$ the space $u\mathcal{TB}_{n}$ is not an isometric subgraph of $u\mathcal{N}_{n}$ under TBR and PR.

Proof.

Let $N$ be the network in Figure 16. Let $N^{\prime}$ be the network derived from $N$ by swapping the labels $1$ and $2$ . Note that $\operatorname{d_{\textup{TBR}}}(N,N^{\prime})=\operatorname{d_{\textup{PR}}}(N,N^{\prime})=2$ , since, from $N$ to $N^{\prime}$ , we can move leaf 2 next to leaf 1 and then move leaf 1 to where leaf 2 was. However, then the network in the middle is not tree-based, since the blob derived from the Petersen graph has no Hamiltonian path if the two pendent edges of the blob are next to each other [FHM18]. We claim that there is no other length two TBR-sequence from $N$ to $N^{\prime}$ . For this proof we call a blob derived from the Petersen graph a Petersen blob.

First, note that the TBR-distance of $N$ and $N^{\prime}$ is at least two and there is thus no TBR-sequence that consists of a TBR- and a TBR+. Otherwise, these two operations could be merged into a single TBR0 by Observation 4.1. Note that we can only move leaf 1 or 2 by pruning an incident edge if we do not affect the split 1 versus 2, 3 or break the tree-based property. Therefore, they either have to be swapped using edges of the Petersen blobs or the $(4,5,6)$ -chain has to be reversed and leaf 3 moved to the other Petersen blob. However, it is straightforward to check that neither can be done with two TBR0. In particular, we can look at what edge the first TBR0 might move and then check whether a second TBR0 can arrive at $N^{\prime}$ . If the first TBR0 breaks a Petersen blob, the problem is that the second TBR0 has to restore it. We then find that this does not allow us to make the initially planned changes to arrive at $N^{\prime}$ . On the other hand, if we avoid breaking the Petersen blob and reverse the $(4,5,6)$ -chain, then leaf 3 is still on the wrong side; and if we move leaf 3 to the other Petersen blob, then not enough TBR0 moves remain to reverse the chain.

Since there is no other length two TBR0-sequence there is also no other length two PR-sequence. ∎

Theorem 6.5.

For $n\geq 5$ and large enough $k$ , the space $u\mathcal{LV}\text{-}k_{n}$ is not an isometric subgraph of $u\mathcal{N}_{n}$ under TBR and PR.

Proof.

For even $k$ , the networks $N$ and $N^{\prime}$ in Figure 17 have TBR- and PR-distance two via the network $M$ . However, note that in $M$ the blobs of size $\frac{k}{2}+1$ and $\frac{k}{2}$ are merged into a blob of size $k+1$ . Therefore, $M$ is not a level- $k$ network. We claim that there is no TBR- or PR-sequence of length two that does not go through a level- $(k+1)$ network like $M$ . An example for odd $k$ can be derived from this.

It is easy to see that the TBR-distance of $N$ and $N^{\prime}$ is at least two and there is thus no TBR-sequence that consists of a TBR- and a TBR+. Otherwise, these two operations could be merged into a single TBR0 by Observation 4.1. We thus have to prove that there is no length two TBR0-sequence from $N$ to $N^{\prime}$ that avoids a level- $(k+1)$ network. Note that it requires two TBR0 (or PR0) to connect $B_{2}$ and $B_{3}$ into $B_{2}^{\prime}$ . Similarly, it requires either two prunings from the upper five-cycle of $B_{2}$ to obtain the triangle $B_{3}^{\prime}$ or one pruning within that cycle. However, in the latter option this would not contribute to connecting $B_{2}$ and $B_{3}$ and hence overall at least three operations would be needed. Therefore we have to combine the two operations necessary to create $B_{2}^{\prime}$ and to create $B_{3}^{\prime}$ , which however gives us a sequence like the one shown in Figure 17. ∎

Note that the results of this section that show that the spaces of tree-based networks and level- $k$ networks are not isometric subgraphs of the space of all networks also hold if we restrict these spaces to a particular tier $r$ (for large enough $r$ ).

7 Computational complexity

In this section, we consider the computational complexity of computing the TBR-distance and the PR-distance. First, we recall the known results on phylogenetic trees.

Theorem 7.1 ([DHJ+97, HDRCB08, AS01]).

Computing the distance of two trees in $u\mathcal{T}_{n}$ is NP-hard for the NNI-distance, the SPR-distance, and the TBR-distance.

In Theorem 6.1, we have shown that $u\mathcal{T}_{n}$ is an isometric subgraph of $u\mathcal{N}_{n}$ under TBR. Hence, with Theorem 7.1, we get the following corollary.

Corollary 7.2.

Computing the TBR-distance of two arbitrary networks in $u\mathcal{N}_{n}$ is NP-hard.

We can use the same two theorems to prove that computing the TBR-distance in tiers is also hard.

Theorem 7.3.

Computing the TBR-distance of two arbitrary networks in $u\mathcal{N}_{n,r}$ is NP-hard.

Proof.

We (linear-time) reduce the NP-hard problem of computing the TBR-distance of two trees in $u\mathcal{T}_{n}$ to computing the TBR-distance of two networks in $u\mathcal{N}_{n+1,r}$ . For this, let $T,T^{\prime}\in u\mathcal{T}_{n}$ . Let $e$ be the edge incident to leaf $n$ of $T$ . Obtain $S$ from $T$ by subdividing $e$ with a new vertex $u$ and adding the edge $\{u,v\}$ where $v$ is a new vertex labelled $n+1$ . Next, add $r$ handcuffs to the cherry $\{n,n+1\}$ to obtain the network $N\in u\mathcal{N}_{n+1,r}$ . Analogously obtain $N^{\prime}$ from $T^{\prime}$ .

The equality $\operatorname{d_{\textup{TBR}}}(T,T^{\prime})=\operatorname{d_{\textup{TBR}}}(N,N^{\prime})$ follows from Lemma 4.10, and the fact that networks handcuffed at a cherry display exactly one tree. More precisely, a TBR-sequence between $T$ and $T^{\prime}$ induces a TBR-sequence of the same length between $N$ and $N^{\prime}$ , hence $\operatorname{d_{\textup{TBR}}}(T,T^{\prime})\geq\operatorname{d_{\textup{TBR}}}(N,N^{\prime})$ . Conversely, by Lemma 4.10 and the fact that $D(N)=\{T\}$ and $D(N^{\prime})=\{T^{\prime}\}$ , it follows that $\operatorname{d_{\textup{TBR}}}(T,T^{\prime})\leq\operatorname{d_{\textup{TBR}}}(N,N^{\prime})$ . Since computing the TBR-distance in $u\mathcal{T}_{n}$ is NP-hard, the statement follows. ∎

To prove that computing the PR-distance is hard, we use a different reduction. Van Iersel et al. prove that deciding whether a tree is displayed by a (not necessarily proper) phylogenetic network (Unrooted Tree Containment; UTC) is NP-hard [VIKS+18]. Combining this with Lemma 4.5, we arrive at our result.

Theorem 7.4.

Computing the PR-distance of two arbitrary networks in $u\mathcal{N}_{n}$ is NP-hard.

Proof.

We reduce from UTC to the problem of computing the PR-distance of two networks in $u\mathcal{N}_{n}$ . Let $(N,T)$ with $N$ a (not necessarily proper) network and $T\in u\mathcal{T}_{n}$ be an arbitrary instance of UTC. We obtain an instance $(N^{\prime},T^{\prime},r^{\prime})$ of the PR-distance decision problem as follows: remove all cut-edges of $N$ that do not separate two labelled leaves, and let $N^{\prime\prime}$ be the connected component containing all the leaves; now, let $N^{\prime}$ be the proper network obtained from $N^{\prime\prime}$ by suppressing all degree two nodes. The instance of the PR-distance decision problem consists of $N^{\prime}$ , $T^{\prime}=T$ , and the reticulation number $r^{\prime}$ of $N^{\prime}$ . As we can compute in polynomial time whether a cut-edge separates two labelled leaves, the reduction is polynomial time. Because a displayed tree uses only cut-edges that separate two labelled leaves, $T$ is displayed by $N$ if and only if it is displayed by $N^{\prime}$ . By Lemma 4.5, $T$ is a displayed tree of $N^{\prime}$ if and only if $\operatorname{d_{\textup{PR}}}(N^{\prime},T^{\prime})\leq r^{\prime}$ , which concludes the proof. ∎
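The preprocessing step of this reduction can be sketched in code. The following is an illustrative sketch under several assumptions that are not from the paper: the network is a multigraph given as a list of vertex pairs, the input has at least two labelled leaves, the reticulation number is computed as $|E|-(|V|-1)$ , and the names `reachable` and `reduce_utc_network` are hypothetical. Cut-edges are detected by a simple quadratic-time connectivity test, which is enough to show that the reduction is polynomial.

```python
from collections import defaultdict


def reachable(edges, start, skip=None):
    """Vertices reachable from `start`, ignoring the edge at index `skip`."""
    adjacency = defaultdict(list)
    for index, (u, v) in enumerate(edges):
        if index != skip:
            adjacency[u].append(v)
            adjacency[v].append(u)
    seen, stack = {start}, [start]
    while stack:
        for w in adjacency[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen


def reduce_utc_network(edges, leaves):
    """Return (edges of N', reticulation number r') for a multigraph N."""
    leaves = set(leaves)
    # Step 1: drop every cut-edge that has no labelled leaf on one side.
    kept = []
    for index, (u, v) in enumerate(edges):
        side_u = reachable(edges, u, skip=index)
        if v not in side_u:  # (u, v) is a cut-edge
            side_v = reachable(edges, v, skip=index)
            if not (side_u & leaves) or not (side_v & leaves):
                continue
        kept.append((u, v))
    # Step 2: keep the component N'' that contains the leaves.
    core = reachable(kept, next(iter(leaves)))
    kept = [(u, v) for (u, v) in kept if u in core]
    # Step 3: suppress unlabelled degree-two vertices one at a time.
    while True:
        degree = defaultdict(int)
        for u, v in kept:
            degree[u] += 1
            degree[v] += 1
        x = next((w for w in degree if degree[w] == 2 and w not in leaves), None)
        if x is None:
            break
        incident = [e for e in kept if x in e]
        a, b = (e[1] if e[0] == x else e[0] for e in incident)
        kept = [e for e in kept if x not in e] + [(a, b)]
    vertices = {w for e in kept for w in e}
    return kept, len(kept) - (len(vertices) - 1)
```

The returned pair corresponds to $N^{\prime}$ and $r^{\prime}$ ; the tree $T^{\prime}=T$ is passed through unchanged.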

Unlike for the hardness proof of TBR-distance, we cannot readily adapt this proof to the PR-distance in $u\mathcal{N}_{n,r}$ . For this purpose, we need to learn more about the structure of PR-space.

8 Concluding remarks

In this paper, we investigated basic properties of spaces of unrooted phylogenetic networks and their metrics under the rearrangement operations NNI, PR, and TBR. We have proven connectedness and bounds on diameters for different classes of phylogenetic networks, including networks that display a particular set of trees, tree-based networks, and level- $k$ networks. Although these parameters have been studied before for classes of rooted phylogenetic network [BLS17], this is the first paper that studies these properties for classes of unrooted phylogenetic networks besides the space of all networks. A summary of our results is shown in Table 1.

To see the improvements in diameter bounds, we compare our results to previously found bounds: For the space of phylogenetic trees $u\mathcal{T}_{n}$ it was known that the diameter is asymptotically linearithmic and linear in the size of the trees under NNI and SPR/TBR [LTZ96, DGH11], respectively. Here, we have shown that the diameter under NNI is also asymptotically linearithmic for higher tiers of phylogenetic networks. Whether this also holds in the rooted case is still open. We have further (re)proven the asymptotic linear diameter for PR and TBR of these tiers and, in particular, improved the upper bound on the diameter under TBR to $n-3-\lfloor\frac{\sqrt{n-2}-1}{2}\rfloor+r$ from the previously best bound $n+2r$ [JJE+18].
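To get a feeling for the size of this improvement, the two upper bounds can be evaluated side by side. This is a small arithmetic sketch; the function names are hypothetical, and the restriction to $n\geq 3$ is an assumption made here so that the square root is taken of a positive integer. It uses the identity $\lfloor\frac{\sqrt{m}-1}{2}\rfloor=\lfloor\frac{\lfloor\sqrt{m}\rfloor-1}{2}\rfloor$ for integers $m\geq 1$ to stay in exact integer arithmetic.

```python
from math import isqrt


def tbr_diameter_upper_bound(n, r):
    """Evaluate n - 3 - floor((sqrt(n - 2) - 1) / 2) + r exactly."""
    if n < 3 or r < 0:
        raise ValueError("this sketch assumes n >= 3 and r >= 0")
    # floor((sqrt(m) - 1) / 2) equals (isqrt(m) - 1) // 2 for integers m >= 1.
    return n - 3 - (isqrt(n - 2) - 1) // 2 + r


def previous_upper_bound(n, r):
    """Evaluate the earlier bound n + 2r."""
    return n + 2 * r
```

For example, with `n = 11` and `r = 4` the first function gives 11 and the second gives 19.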

To uncover local structures of network spaces, we looked at properties of shortest sequences of moves between two networks. Here we found that shortest TBR-sequences between networks in the same tier never traverse lower tiers, and shortest TBR-sequences between trees also never traverse higher tiers. This implies that $u\mathcal{T}_{n}$ is an isometric subgraph of $u\mathcal{N}_{n}$ under TBR, and that computing the TBR-distance between two networks in $u\mathcal{N}_{n}$ is NP-hard. This answers a question by Francis et al. [FHMW18]. We have attempted to prove similar results for other subspaces and rearrangement moves. However, for higher tiers, we have not been able to prove that shortest TBR-sequences never traverse higher tiers. To answer this question we may need to utilise agreement graphs, as frequently used for phylogenetic trees [AS01, BS05] and, more recently, also for rooted phylogenetic networks [KL19, Kla19]. Concerning NNI and PR we gave counterexamples to prove that higher tiers are not isometric subgraphs of $u\mathcal{N}_{n}$ . The questions of whether $u\mathcal{T}_{n}$ is isometrically embedded in $u\mathcal{N}_{n}$ under PR and under NNI remain open. Answering these questions positively would also provide an answer to the question whether computing the NNI-distance between two networks is NP-hard, and clues toward proving whether computing the PR-distance between two networks in the same tier is NP-hard. Further negative results that we have shown are that the spaces of tree-based networks and level- $k$ networks are not isometric subgraphs of the space of all phylogenetic networks.

Throughout this paper, we have restricted our attention to proper networks. We could also have chosen to use unrooted networks without the properness condition. This definition, which is mathematically more elegant, is used in most other papers, so it seems to be the obvious choice. However, it is not natural to have cut-edges that do not separate leaves: such networks carry no biological meaning. It is desirable that networks are rootable and thus have an evolutionary interpretation. Unrooted phylogenetic networks are rootable if they have at most one blob with one cut-edge. While using this in the definition of an unrooted phylogenetic network could therefore be sufficient, we go one step further, and ask that there is no such blob. This makes a network rootable at any leaf (i.e., with any taxon as out-group), which gives a stronger biological interpretation and usability.

The fact that our definition of unrooted phylogenetic networks is mathematically more restrictive means that any positive result we have proven is likely also true when using a less restrictive definition. That is, connectedness for those definitions follows easily by finding sequences to proper networks, as done by Janssen et al. [JJE+18]. As we may be able to find short sequences for this purpose, the diameter results will likely also still hold. This means that whatever definitions may be used in practice, with minor additional arguments, our results provide the theoretical background necessary to justify local search operations.

Acknowledgements

The first author was supported by the Netherlands Organization for Scientific Research (NWO) Vidi grant 639.072.602. The second author thanks the New Zealand Marsden Fund for their financial support.

Bibliography

[AS01] B. L. Allen and M. Steel, “Subtree transfer operations and their induced metrics on evolutionary trees,” Annals of Combinatorics, vol. 5, no. 1, pp. 1–15, 2001. doi:10.1007/s00026-001-8006-8
[BLS17] M. Bordewich, S. Linz, and C. Semple, “Lost in space? Generalising subtree prune and regraft to spaces of phylogenetic networks,” Journal of Theoretical Biology, vol. 423, pp. 1–12, 2017. doi:10.1016/j.jtbi.2017.03.032
[Bor03] M. Bordewich, “The complexity of counting and randomised approximation,” Ph.D. dissertation, University of Oxford, 2003. http://community.dur.ac.uk/m.j.r.bordewich/papers/Bordewich2003-a.pdf
[BS05] M. Bordewich and C. Semple, “On the computational complexity of the rooted subtree prune and regraft distance,” Annals of Combinatorics, vol. 8, no. 4, pp. 409–423, 2005. doi:10.1007/s00026-004-0229-z
[DGH11] Y. Ding, S. Grünewald, and P. J. Humphries, “On agreement forests,” Journal of Combinatorial Theory, Series A, vol. 118, no. 7, pp. 2059–2065, 2011. doi:10.1016/j.jcta.2011.04.013
[DHJ+97] B. Das Gupta, X. He, T. Jiang, M. Li, J. Tromp, and L. Zhang, “On distances between phylogenetic trees,” in Proceedings of the 8th Annual ACM-SIAM Symposium on Discrete Algorithms, 1997, pp. 427–436.
[Die17] R. Diestel, Graph Theory, 5th ed. Springer Berlin Heidelberg, 2017. doi:10.1007/978-3-662-53622-3
[FHM18] A. Francis, K. T. Huber, and V. Moulton, “Tree-based unrooted phylogenetic networks,” Bulletin of Mathematical Biology, vol. 80, no. 2, pp. 404–416, 2018. doi:10.1007/s11538-017-0381-3
