Further Exploration of an Upper Bound for Kemeny’s Constant

Robert E. Kooij; Johan L. A. Dubbeldam

PMC · DOI:10.3390/e27040384·April 4, 2025


Open Access

TL;DR

This paper explores a mathematical upper bound for Kemeny’s constant in graphs and shows how it can be used efficiently for large networks.

Contribution

The paper generalizes previous bounds for specific graph classes and demonstrates practical numerical approximations for large-scale networks.

Findings

01 The previously found bound is tight for bipartite and windmill graphs.

02 Numerical approximations using the bound are efficient for real-world networks.

03 The method provides a 30x speedup for graphs with up to 100K nodes.

Abstract

Even though Kemeny’s constant was first discovered in Markov chains and expressed by Kemeny in terms of mean first passage times on a graph, it can also be expressed using the pseudo-inverse of the Laplacian matrix representing the graph, which facilitates the calculation of a sharp upper bound of Kemeny’s constant. We show that for certain classes of graphs, a previously found bound is tight, which generalises previous results for bipartite and (generalised) windmill graphs. Moreover, we show numerically that for real-world networks, this bound can be used to find good numerical approximations for Kemeny’s constant. For certain graphs consisting of up to 100 K nodes, we find a speedup of a factor 30, depending on the accuracy of the approximation that can be achieved. For networks consisting of over 500 K nodes, the approximation can be used to estimate values for the Kemeny constant,…


Funding

European Union’s Horizon 2020 research and innovation program
Marie Sklodowska-Curie
Dutch National Foundation

Keywords

Kemeny’s constant; effective graph resistance; random walks; spectral graph theory; pseudo-inverse Laplacian; 05C50; 05C75; 05C82


Taxonomy

Topics: Graph theory and applications · Complex Network Analysis Techniques · Advanced Queuing Theory Analysis

Full text

1. Introduction

Kemeny’s constant, a graph metric first proposed in 1960 [1], links random walks, Markov chains, and spectral graph theory; see, for instance, [2,3,4].

An intuitive way to understand Kemeny’s constant is by random walks on a graph, which was also how it was originally presented by Kemeny [1]. For an undirected connected graph with an adjacency matrix A, we can define a transition matrix $[eqn]$ for the transition from state i to j, where $[eqn]$ is the degree of node i. This defines an irreducible finite-state Markov chain in discrete time with an $[eqn]$ transition matrix $[eqn]$ [5]. If we also know the mean first-passage time matrix $[eqn]$ denoting the average time to go from a vertex i to a vertex j (we take $[eqn]$ by convention), the Kemeny constant is defined by

[eqn]

where $[eqn]$ is the j-th component of the stationary solution of the random walk. The fact that $[eqn]$ does not depend on the index i (which can be interpreted as the starting state of the random walk), and is therefore truly a constant, was discussed in a number of papers [6,7]. Hunter [8] and Kirkland [9] have analysed Relation (1) and established a connection with generalised matrix inverses.
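The independence from the starting node can be checked numerically on a small graph. The sketch below assumes the convention that the mean first-passage time from a node to itself is zero and that the stationary distribution of the random walk is $d_i/(2L)$. The example graph (a triangle with a pendant node) and all function names are illustrative choices, not taken from the paper.

```python
import numpy as np

def transition_matrix(A):
    """Random-walk transition matrix P = D^{-1} A for an adjacency matrix A."""
    d = A.sum(axis=1)
    return A / d[:, None]

def mean_first_passage_times(P):
    """Matrix M with M[i, j] = expected number of steps to first reach j from i.

    Assumed convention: M[j, j] = 0.
    """
    n = P.shape[0]
    M = np.zeros((n, n))
    for j in range(n):
        keep = [i for i in range(n) if i != j]
        # For i != j: m_ij = 1 + sum_{k != j} P[i, k] * m_kj
        sub = np.eye(n - 1) - P[np.ix_(keep, keep)]
        M[keep, j] = np.linalg.solve(sub, np.ones(n - 1))
    return M

# Illustrative graph: a triangle (nodes 0, 1, 2) with a pendant node 3.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

P = transition_matrix(A)
pi = A.sum(axis=1) / A.sum()           # stationary distribution d_i / (2L)
M = mean_first_passage_times(P)
kemeny_per_start = M @ pi              # one value per starting node i

# Cross-check against the spectral form: sum of 1 / (1 - lambda) over
# all eigenvalues of P except the eigenvalue 1.
eigs = np.sort(np.linalg.eigvals(P).real)
kemeny_spectral = np.sum(1.0 / (1.0 - eigs[:-1]))
```

All entries of `kemeny_per_start` should coincide, and they should match `kemeny_spectral`, which is the same quantity expressed through the spectrum of the transition matrix.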

The Kemeny constant also has an interpretation as a ‘mixing time’, which was originally proposed by Hunter in [7]. Here, we briefly repeat the demonstration that the Kemeny constant can be identified by a mixing time and show that this can be directly interpreted in terms of entropy. Let us define the ‘time to mixing’, T, of a Markov chain $[eqn]$ following [7], as the smallest index k at which $[eqn]$ , where Y is a random variable distributed according to the stationary distribution of the Markov chain $[eqn]$ . We can now calculate the conditional expectation value of T, $[eqn]$ ,

[eqn]

where $[eqn]$ is the mean first-passage time for going from node i to node j.

Equation (2) for the mixing time permits an interpretation in terms of relative entropy or Kullback–Leibler divergence $[eqn]$ , which measures the distance between the distributions p and $[eqn]$ ; see also [10]. The relative entropy is defined as

[eqn]

Since $[eqn]$ with equality only when $[eqn]$ for all $[eqn]$ , the time to mixing can be interpreted as the smallest value of n for which the relative entropy $[eqn]$ .

Kemeny’s constant has recently also been suggested as a metric to identify bottleneck roads whose removal would greatly reduce the connectivity of the network [11] or as a metric to determine the ‘superspreader’ links that transmit disease between different communities [12].

It has already been established that there are several equivalent ways to express Kemeny’s constant: using effective graph resistance, random walks, spectral graph theory, and pseudo-inverse Laplacians; see [8].

The study of Kemeny’s constant is still an active and relevant research field, as was showcased by the mini-symposium “Kemeny’s constant on networks and its application”, organised as part of the 24th Conference of the International Linear Algebra Society in Galway, Ireland, 20–24 June 2022 [13], as well as by recent papers addressing applications of Kemeny’s constant to different networks [14,15].

In 2017, Wang et al. [4] derived a closed-form formula for Kemeny’s constant, $[eqn]$ for a random walk on a graph G with N nodes and L edges, where the transition matrix was given by $[eqn]$ , where $[eqn]$ is the adjacency matrix of G and $[eqn]$ is a diagonal matrix containing the degrees of the nodes. In [4], it was shown that $[eqn]$ can be expressed in terms of the Moore–Penrose pseudo-inverse $[eqn]$ of the Laplacian matrix of G, as

[eqn]

where the column vector $[eqn]$ and $[eqn]$ denotes the degree vector for the graph.
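A minimal numerical sketch of the pseudo-inverse expression follows. It assumes that Equation (3) has the form $K = \zeta^T d - d^T Q^\dagger d/(2L)$, where $\zeta$ collects the diagonal entries of the pseudo-inverse $Q^\dagger$ and $d$ is the degree vector; the symbols and the function name are assumptions made for illustration, since the equation itself is not reproduced above. The test case is the complete graph on $N$ nodes, for which the random-walk definition gives $(N-1)^2/N$.

```python
import numpy as np

def kemeny_from_laplacian(A):
    """Kemeny's constant from the Laplacian pseudo-inverse (assumed form of Eq. (3))."""
    d = A.sum(axis=1)                 # degree vector
    Q = np.diag(d) - A                # Laplacian matrix
    Q_pinv = np.linalg.pinv(Q)        # Moore-Penrose pseudo-inverse
    zeta = np.diag(Q_pinv)            # diagonal entries of the pseudo-inverse
    two_L = d.sum()                   # twice the number of edges
    return zeta @ d - d @ Q_pinv @ d / two_L

N = 5
A_complete = np.ones((N, N)) - np.eye(N)
K_complete = kemeny_from_laplacian(A_complete)
K_expected = (N - 1) ** 2 / N         # closed form for the complete graph
```

The dense pseudo-inverse makes this approach practical only for small graphs, which is the cost that the upper bound discussed next is meant to avoid.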

In [4], not only was Equation (3) derived, but also a closely connected upper bound:

[eqn]

where $[eqn]$ is the average degree and $[eqn]$ is the largest eigenvalue of the Laplacian matrix $[eqn]$ corresponding to graph G. Here, $[eqn]$ denotes the diagonal matrix containing the degrees of the nodes. The heterogeneity index $[eqn]$ , measuring the variability in the degrees of the nodes (see [16]) is defined as

[eqn]

where $[eqn]$ is the degree of the i-th node.
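The bound can be illustrated numerically under one assumption about its form. Because the all-one vector lies in the null space of the pseudo-inverse Laplacian and the smallest non-zero eigenvalue of the pseudo-inverse is the reciprocal of the largest Laplacian eigenvalue, one has $d^T Q^\dagger d \geq \|d - D u\|^2/\mu_{\max}$. The sketch below therefore writes the bound as $\zeta^T d - \|d - D u\|^2/(2L\,\mu_{\max})$. The degree-variability term should correspond to the heterogeneity-index term of Equation (4), but the exact normalisation used there is not reproduced here, so this form and all names are assumptions.

```python
import numpy as np

def kemeny_exact(A):
    """Exact value via the pseudo-inverse Laplacian (assumed form of Eq. (3))."""
    d = A.sum(axis=1)
    Q_pinv = np.linalg.pinv(np.diag(d) - A)
    return np.diag(Q_pinv) @ d - d @ Q_pinv @ d / d.sum()

def kemeny_upper_bound(A):
    """Upper bound using only diag(Q^+), the degrees and the largest eigenvalue."""
    d = A.sum(axis=1)
    Q = np.diag(d) - A
    zeta = np.diag(np.linalg.pinv(Q))
    mu_max = np.linalg.eigvalsh(Q)[-1]     # eigvalsh returns ascending eigenvalues
    dev = d - d.mean()                     # deviation from the average degree D
    return zeta @ d - dev @ dev / (d.sum() * mu_max)

# Path graph on 5 nodes: the bound is expected to be strict.
A_path = np.zeros((5, 5))
for i in range(4):
    A_path[i, i + 1] = A_path[i + 1, i] = 1.0

# Star graph K_{1,4}, a complete bipartite graph: the bound is expected to be tight.
A_star = np.zeros((5, 5))
A_star[0, 1:] = A_star[1:, 0] = 1.0

K_path, KU_path = kemeny_exact(A_path), kemeny_upper_bound(A_path)
K_star, KU_star = kemeny_exact(A_star), kemeny_upper_bound(A_star)
```

For brevity this sketch still forms the full pseudo-inverse to read off its diagonal; the bound itself needs only that diagonal.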

It was shown in [17] that the upper bound given in Equation (4) is tight, meaning that we have an equality in Equation (4), for two classes of graphs, namely complete bipartite graphs and (generalised) windmill graphs. A windmill graph $[eqn]$ consists of $[eqn]$ copies of the complete graph $[eqn]$ , with each node connected to a common node. Two generalisations of windmill graphs were suggested by Kooij [18] in 2019. For both generalisations, we replace the central node, connecting all $[eqn]$ copies of the complete graph $[eqn]$ , with l central nodes. For the first generalisation, we assume that the l central nodes are all connected, i.e., they form a clique $[eqn]$ . We call this a generalised windmill graph of Type I and denote it by $[eqn]$ . For the second generalisation, we assume that the l central nodes have no connections among each other. We will refer to it as a Type II generalised windmill graph and denote it by $[eqn]$ . Figure 1 shows examples of a windmill graph and its two generalisations.

The aim of this paper is four-fold. First, we will consider a broad family of graphs, which contain complete bipartite and (generalised) windmill graphs as special cases, and show analytically that for these graphs, the bound Equation (4) is tight. Graphs in this family have in common that they are bimodal and have a diameter of two. However, we will also show that these conditions are not sufficient to ensure that Equation (4) is tight. Next, we compare the complexity of the computation of the upper-bound Equation (4) with the exact expression for Kemeny’s constant, given by Equation (3). In [17], we have already compared the exact value of $[eqn]$ with the upper bound for some real-world networks. However, the considered networks were of rather moderate size ( $[eqn]$ ). Here, we will assess the performance of $[eqn]$ on real-world networks of sizes up to around 365 K nodes and $[eqn]$ M edges.

Finally, in addition to Equation (4), we also assess the performance of an upper bound suggested by de Vriendt [19] based on the so-called resistance radius of a graph:

[eqn]

where the resistance radius $[eqn]$ is defined as

[eqn]

with $[eqn]$ denoting the resistance matrix and u the all-one vector. The upper-bound Equation (5) is tight for vertex-transitive graphs. Here, we remark that vertex-transitive graphs are rather exceptional and are typically highly symmetric; examples of vertex-transitive graphs are Cayley graphs and the Petersen graph [20]. We will show in this paper that the bound $[eqn]$ is not a good estimate for the Kemeny constant for the classes of graphs that are considered in this paper and that $[eqn]$ is in general a much better estimate.

2. A Family of Biregular Graphs with Diameter 2

2.1. Construction

The aim is to construct a family of graphs that contains the complete bipartite and (generalised) windmill graphs as special cases and is commonly known as the combination of two regular graphs, denoted $[eqn]$ . We start the construction by considering a $[eqn]$ -regular graph $[eqn]$ on $[eqn]$ nodes, and a $[eqn]$ -regular graph $[eqn]$ on $[eqn]$ nodes. We assume $[eqn]$ and also $[eqn]$ . Finally, we connect every node in $[eqn]$ to every node in $[eqn]$ to obtain the graph G. The nodes in G that are also in $[eqn]$ have degree $[eqn]$ , while the nodes in $[eqn]$ have degree $[eqn]$ . This construction yields a graph $[eqn]$ that is a so-called biregular graph in which all nodes of $[eqn]$ have the same degree and the same holds for all nodes of $[eqn]$ ; see also [21]. Only if $[eqn]$ is the graph G regular. By construction, G has diameter 2.

The choice of $[eqn]$ and $[eqn]$ leads to the complete bipartite graph $[eqn]$ . If we take $[eqn]$ isolated copies of the complete graph $[eqn]$ as $[eqn]$ and an isolated node for $[eqn]$ , then G is the windmill graph $[eqn]$ . If instead, we let $[eqn]$ be a complete graph $[eqn]$ , then G is a generalised windmill graph of Type I, $[eqn]$ , whereas if we let $[eqn]$ consist of l isolated nodes, G is a generalised windmill graph of Type II, $[eqn]$ .

Figure 2 shows an example of a graph that belongs to the suggested family of graphs. Here, $[eqn]$ , on the left side of the figure, is a random regular graph with $[eqn]$ , on $[eqn]$ nodes, while $[eqn]$ is a graph on $[eqn]$ nodes, where each node has degree $[eqn]$ . For the graph G, the nodes in $[eqn]$ have degree 11, while the nodes in $[eqn]$ have degree 15.

2.2. Tightness of the Upper Bound KU(G)

We will now show for the family of graphs proposed in the previous subsection that the upper-bound Equation (4) for Kemeny’s constant is tight.

Theorem 1. Consider two graphs $[eqn]$ and $[eqn]$ with all vertices in $[eqn]$ with degree $[eqn]$ and those in $[eqn]$ degree $[eqn]$ . If we connect each of the vertices in $[eqn]$ with all nodes of $[eqn]$ , then Kemeny’s constant $[eqn]$ for the resulting graph G is given by $[eqn]$ , that is, the upper-bound Equation (4) is tight.

Proof. First, we give expressions for the average degree D and the heterogeneity index H, which appear in the upper-bound Equation (4). Denoting the degrees of the nodes in G in $[eqn]$ and $[eqn]$ as $[eqn]$ and $[eqn]$ , respectively, we obtain

[eqn]

and

[eqn]

The average degree of G, $[eqn]$ , which we abbreviate for notational convenience to D, is defined by

[eqn]

The heterogeneity index $[eqn]$ , a metric which quantifies the variability of the degree distribution (see [16]), is defined as follows:

[eqn]

where $[eqn]$ denotes the degree of node i in graph G. Using the expressions for degrees $[eqn]$ and $[eqn]$ found in (7) and (8) and expression (9) for D, we obtain

[eqn]

We will now prove the statement by first calculating the Laplacian matrix Q for the graph G, which has the following special structure:

[eqn]

where $[eqn]$ is an all-one $[eqn]$ matrix, and the square matrices $[eqn]$ and $[eqn]$ are defined as

[eqn]

where $[eqn]$ is the Laplacian of graph $[eqn]$ ( $[eqn]$ ), and $[eqn]$ denote the identity matrices of size $[eqn]$ and $[eqn]$ , respectively. The decomposition of Q into 4 blocks can be understood by realising that the upper right-hand block, $[eqn]$ , represents the $[eqn]$ links that exist between each vertex of $[eqn]$ and all the vertices of $[eqn]$ . Since Q is a Laplacian matrix, we have to ensure that all rows sum up to zero, which can be achieved by adding $[eqn]$ to each of the diagonal entries of the $[eqn]$ block in the upper left-hand corner, that is, the block $[eqn]$ should be as defined above. Analogously, we find that the lower left-hand and right-hand blocks should be equal to $[eqn]$ and $[eqn]$ , respectively.

Two eigenvectors, $[eqn]$ and $[eqn]$ , can be found by inspection: $[eqn]$ , which corresponds to eigenvalue $[eqn]$ , and $[eqn]$ , which has $[eqn]$ entries equal to $[eqn]$ and $[eqn]$ entries equal to $[eqn]$ and corresponds to $[eqn]$ .

Because the largest Laplacian eigenvalue is upper-bounded by N, the number of nodes in a graph (see [22]), we directly obtain that $[eqn]$ is the largest eigenvalue of Q. Combining this with Equations (9)–(11), we obtain

[eqn]

Since eigenvectors corresponding to different eigenvalues are all orthogonal and those corresponding to the same eigenvalues can be chosen to be orthogonal, due to the symmetry of Q, all eigenvectors $[eqn]$ that are not equal to $[eqn]$ or $[eqn]$ are subject to

[eqn]

which leads to

[eqn]

We next turn to the expression $[eqn]$ , where $[eqn]$ , in which $[eqn]$ is the i-th eigenvalue of Q and $[eqn]$ is the normalised eigenvector. The conditions for the eigenvectors (15) imply that all terms in the expression $[eqn]$ vanish except the term associated with $[eqn]$ . More precisely, we find that

[eqn]

where $[eqn]$ , so the first $[eqn]$ components all have degree $[eqn]$ and the remaining components have degree $[eqn]$ , which implies $[eqn]$ by Equation (15). Finally, because from $[eqn]$ , we obtain

[eqn]

it follows that $[eqn]$ equals Equation (14), which proves the theorem. □
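The statement of Theorem 1 can be checked numerically on a small instance of the construction. The sketch below joins a cycle on 5 nodes (2-regular) with 2 isolated nodes (0-regular), verifies that a vector that is constant on each part and orthogonal to the all-one vector is an eigenvector of the Laplacian with eigenvalue N, and compares the exact value with the bound. The specific graphs, the scaling of the eigenvector and the assumed forms of Equations (3) and (4) are illustrative choices, not taken from the paper.

```python
import numpy as np

def cycle(n):
    """Adjacency matrix of the cycle graph on n nodes."""
    A = np.zeros((n, n))
    for i in range(n):
        A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0
    return A

def join(A1, A2):
    """Connect every node of the first graph to every node of the second."""
    n1, n2 = len(A1), len(A2)
    return np.block([[A1, np.ones((n1, n2))],
                     [np.ones((n2, n1)), A2]])

n1, n2 = 5, 2
A = join(cycle(n1), np.zeros((n2, n2)))    # G1 = C5, G2 = two isolated nodes
N = n1 + n2
d = A.sum(axis=1)                           # degrees: 4 on G1, 5 on G2
Q = np.diag(d) - A
Q_pinv = np.linalg.pinv(Q)

# Candidate eigenvector: n2 on the nodes of G1 and -n1 on the nodes of G2
# (any non-zero multiple would do; this scaling is an assumption).
x2 = np.concatenate([np.full(n1, float(n2)), np.full(n2, -float(n1))])
residual = Q @ x2 - N * x2                  # zero if x2 has eigenvalue N

zeta = np.diag(Q_pinv)
K_exact = zeta @ d - d @ Q_pinv @ d / d.sum()
dev = d - d.mean()
mu_max = np.linalg.eigvalsh(Q)[-1]
K_upper = zeta @ d - dev @ dev / (d.sum() * mu_max)
```

Here the degree deviation `dev` is proportional to `x2`, which is why only the term belonging to the largest eigenvalue survives and the two values agree.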

2.3. Some Examples

As a first example, we consider the graph depicted in Figure 2, where $[eqn]$ , $[eqn]$ , $[eqn]$ and $[eqn]$ . Using Python (https://www.python.org/) code, we have evaluated both K and $[eqn]$ . For this network, we obtain $[eqn]$ , which is equal to $[eqn]$ to numerical precision, as should be according to Theorem 1. On the other hand, the upper bound $[eqn]$ based upon the resistance radius gives $[eqn]$ , which is reasonably close to the actual value.

Next, we consider a graph where $[eqn]$ , $[eqn]$ , $[eqn]$ and $[eqn]$ ; see Figure 3. Here, we get $[eqn]$ , and again, K and $[eqn]$ are numerically extremely close. On the other hand, for this graph, the bound Equation (5) is two orders larger than the actual value: $[eqn]$ .

As a final example, we consider the case where $[eqn]$ , $[eqn]$ , $[eqn]$ and $[eqn]$ ; see Figure 4. Now, $[eqn]$ and again K and $[eqn]$ are equal to numerical precision. Again, the bound based on the resistance radius is much higher: $[eqn]$ .

We end this subsection by noting that the choice for the examples in this subsection was rather arbitrary. We also ran our Python script on several other graphs with sizes up to 1500 nodes. Each time, it yielded the same result: K and $[eqn]$ have values that are numerically very close (see also [4,17] for more numerical comparisons), while the upper bound $[eqn]$ exceeds Kemeny’s constant by a few orders.

3. Graphs with Diameter 2 for Which KU(P) Is Not Tight

3.1. Bimodal Graphs with Diameter 2 for Which Equation (4) Is Not Tight

The numerical results of the examples on biregular graphs with diameter 2 from the previous section showed that in all these cases, the approximation of K by $[eqn]$ is actually exact. In other words, the bound $[eqn]$ is tight in these cases. Therefore, one might be tempted to believe that Equation (4) is tight for all biregular graphs with diameter 2. In this section, we prove that this is not the case by giving some counterexamples.

The simplest counterexample we could find consists of the cycle graph $[eqn]$ with an additional link; see Figure 5.

For this graph, we get $[eqn]$ , $[eqn]$ and $[eqn]$ . There is a simple procedure to check whether or not a biregular graph G with diameter 2 belongs to the graph family constructed in the previous section. First, partition the nodes into two sets $[eqn]$ and $[eqn]$ where all the nodes in the set $[eqn]$ have degree $[eqn]$ , while all the nodes in the set $[eqn]$ have degree $[eqn]$ . Next, verify if the number of links between the 2 sets is $[eqn]$ and all nodes of $[eqn]$ are linked to all nodes of $[eqn]$ . If this is not the case, the graph $[eqn]$ . In the other case, remove all $[eqn]$ links between the sets $[eqn]$ and $[eqn]$ . If the remaining two graphs are not both regular, then the original graph G does not belong to the family constructed in the previous section, that is, $[eqn]$ .
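The membership check described above can be sketched as follows. This is a minimal illustration that assumes the graph has exactly two distinct degree values; the function name and the two test graphs (the complete bipartite graph $K_{2,3}$ and a 5-cycle with one chord) are hypothetical examples, not the graphs shown in the figures.

```python
import numpy as np

def in_join_family(A):
    """Return True if the graph is a join of two regular graphs split by degree."""
    A = np.asarray(A, dtype=float)
    d = A.sum(axis=1)
    degrees = np.unique(d)
    if len(degrees) != 2:
        return False                        # sketch covers the two-degree case only
    V1 = np.flatnonzero(d == degrees[0])
    V2 = np.flatnonzero(d == degrees[1])
    # Step 1: every node of V1 must be linked to every node of V2.
    if not np.all(A[np.ix_(V1, V2)] == 1):
        return False
    # Step 2: drop the cross links and test both remaining graphs for regularity.
    # With a degree-based partition and all cross links present, the induced
    # degrees equal the node degree minus the size of the other set, so this
    # step acts as a sanity check.
    deg1 = A[np.ix_(V1, V1)].sum(axis=1)
    deg2 = A[np.ix_(V2, V2)].sum(axis=1)
    return bool(np.all(deg1 == deg1[0]) and np.all(deg2 == deg2[0]))

# Complete bipartite graph K_{2,3}: belongs to the family.
A_bip = np.block([[np.zeros((2, 2)), np.ones((2, 3))],
                  [np.ones((3, 2)), np.zeros((3, 3))]])

# Cycle on 5 nodes with one extra chord between nodes 0 and 2: does not belong.
A_chord = np.zeros((5, 5))
for i in range(5):
    A_chord[i, (i + 1) % 5] = A_chord[(i + 1) % 5, i] = 1.0
A_chord[0, 2] = A_chord[2, 0] = 1.0

bip_in_family = in_join_family(A_bip)
chord_in_family = in_join_family(A_chord)
```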

The second counterexample is constructed by adding a link to the Petersen graph; see Figure 6.

For this graph, we get $[eqn]$ , $[eqn]$ and $[eqn]$ .

3.2. Non-Biregular Graphs with Diameter 2

We now give an example of a non-biregular graph with diameter 2, for which the upper-bound Equation (4) also does not equal Kemeny’s constant. We construct the graph by first taking a complete graph $[eqn]$ on N nodes. Next, we add one node and connect it to one node in $[eqn]$ and therefore the resulting graph has diameter 2. The resulting graph has $[eqn]$ nodes with degree $[eqn]$ , one node with degree N, and one node with degree 1. Figure 7 shows an example with $[eqn]$ .

Applying Equation (3), we get $[eqn]$ , while the upper bound of Equation (4) gives $[eqn]$ and $[eqn]$ .

4. Regular Graphs

In this section, we consider regular graphs on N nodes with degree r. In this case, the relation between Kemeny’s constant and the effective graph resistance was shown [23] to be

[eqn]

where $[eqn]$ denotes the effective graph resistance. Next, we show that for these graphs, the upper-bound Equation (4) is also tight. For this, we will use the following expression for the effective graph resistance (see [4]):

[eqn]

For r-regular graphs, $[eqn]$ , and therefore Equation (4) gives

[eqn]

hence $[eqn]$ according to Equation (18).
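The regular case can be verified on the Petersen graph, which is 3-regular on 10 nodes. The sketch assumes that the effective graph resistance equals $N$ times the trace of the pseudo-inverse Laplacian and that Equation (18) reads $K = r R_G / N$; both forms, and the helper name, are assumptions made for this illustration.

```python
import numpy as np

def petersen():
    """Adjacency matrix of the Petersen graph (outer 5-cycle, spokes, inner pentagram)."""
    A = np.zeros((10, 10))
    for i in range(5):
        edges = [(i, (i + 1) % 5),            # outer cycle
                 (i, i + 5),                  # spoke
                 (i + 5, (i + 2) % 5 + 5)]    # inner pentagram
        for u, v in edges:
            A[u, v] = A[v, u] = 1.0
    return A

A = petersen()
N, r = len(A), 3
d = A.sum(axis=1)
Q = np.diag(d) - A
Q_pinv = np.linalg.pinv(Q)

R_G = N * np.trace(Q_pinv)                    # effective graph resistance (assumed form)
K_exact = np.diag(Q_pinv) @ d - d @ Q_pinv @ d / d.sum()
K_regular = r * R_G / N                       # assumed form of Eq. (18)
```

For a regular graph the degree vector is a multiple of the all-one vector, so the quadratic term vanishes and the degree deviation is zero, which is why the exact value, the regular-graph formula and the bound all coincide.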

As an example, we consider a random 3-regular graph on 100 nodes (see Figure 8), which has diameter 10. We get numerically $[eqn]$ , which is indeed equal to $[eqn]$ up to the numerical precision of $[eqn]$ . Applying Equation (5) gives $[eqn]$ . In this case, the upper-bound Equation (5) is not tight because the graph is not vertex-transitive.

5. Complexity for the Computation of KU(P)

The time complexity of $[eqn]$ , computed via Equation (3), is dominated by the Laplacian pseudo-inverse, which is as expensive as performing a dense matrix multiplication and takes $[eqn]$ in practice with standard tools. On the other hand, the time complexity of $[eqn]$ mainly depends on two operations: computing the largest Laplacian eigenvalue and performing the dot product of a degree vector and the diagonal element vector of the Laplacian pseudo-inverse. Interestingly, to compute $[eqn]$ , we can avoid the full pseudo-inversion as it only requires the diagonal elements of the Laplacian pseudo-inverse. Algorithms that approximate the diagonal (or the trace) of matrices often use iterative methods, sparse direct methods [24], Monte Carlo [25] or deterministic probing techniques [26]. Although faster than computing the full inversion, these approaches are still time-consuming in practice for large graphs [27]. For that reason, we employ a recently proposed algorithm that approximates the diagonal entries of the Laplacian pseudo-inverse using combinatorial connections [27]. This algorithm exploits the relation between effective resistance and the pseudo-inverse Laplacian. In order to calculate the diagonal elements of $[eqn]$ , it is sufficient to compute the electrical farness $[eqn]$ of each node u in the set of all nodes V; the farness is defined by

[eqn]

Here, $[eqn]$ is the effective resistance between node u and v, which is the potential difference between u and v when a unit current is injected in graph G at node u and extracted at node v [28]. Rather than calculate $[eqn]$ for each pair of nodes, we sample a set of uniform spanning trees. This approach provides a probabilistic absolute approximation guarantee.
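The link between farness and the diagonal of the pseudo-inverse can be made concrete with a small exact computation. The sketch assumes that the electrical farness of a node is the plain sum of its effective resistances to all other nodes. Under that assumption, and because the rows of the pseudo-inverse sum to zero, the farness of node u equals $N Q^\dagger_{uu} + \mathrm{tr}(Q^\dagger)$. The farness values here are computed exactly from the pseudo-inverse purely to illustrate this identity; the sampling of spanning trees used in [27] is not implemented.

```python
import numpy as np

# Illustrative graph: a triangle (nodes 0, 1, 2) with a pendant node 3.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
N = len(A)
Q = np.diag(A.sum(axis=1)) - A
Q_pinv = np.linalg.pinv(Q)
zeta = np.diag(Q_pinv)

# Effective resistance between u and v: Q+_uu + Q+_vv - 2 Q+_uv.
Omega = zeta[:, None] + zeta[None, :] - 2.0 * Q_pinv

# Electrical farness (assumed definition: sum of resistances to all nodes).
farness = Omega.sum(axis=1)

# Summing the identity over all nodes gives sum(farness) = 2 N tr(Q+).
trace_from_farness = farness.sum() / (2.0 * N)
diag_from_farness = (farness - trace_from_farness) / N
```

In an approximate setting the array `farness` would be replaced by estimates, and the last two lines would convert them into estimates of the diagonal entries.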

The algorithm’s time complexity is summarised in the following proposition:

Proposition 1 ([27]). Let $[eqn]$ be an undirected and weighted graph with N nodes and L edges. The sampling algorithm, briefly described above, gives an approximation of the diagonal elements of $[eqn]$ with absolute error $[eqn]$ with probability $[eqn]$ in an expected time $[eqn]$ , where $[eqn]$ is the length of the longest shortest path (eccentricity) starting in a selected node u. For small-world graphs and $[eqn]$ (for high probability), this yields a time complexity of $[eqn]$ .

For networks that have small-world characteristics, a common feature for many real-world networks [29], the above algorithm obtains a $[eqn]$ -approximation with high probability, in a time that is linear in L up to polylogarithmic terms and quadratic in $[eqn]$ . Furthermore, computing the largest Laplacian eigenvalue does not change the overall complexity bound. More precisely, this step often takes $[eqn]$ time for sparse matrices using standard iterative methods, such as the Lanczos algorithm [30]. In general, the actual running time for this step highly depends on the desired accuracy and the eigenvalue distribution of the involved matrix. Overall, the complexity bound for computing $[eqn]$ for small-world graphs using the above techniques is linear in the number of links L (up to a polylogarithmic factor).
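As a rough illustration of the eigenvalue step, the sketch below uses SciPy's `eigsh`, an ARPACK-based implicitly restarted Lanczos routine, on a sparse Laplacian. This is a stand-in chosen for brevity and is not the SLEPc-based setup used in the experiments. The star graph is a hypothetical test case whose largest Laplacian eigenvalue is known to equal the number of nodes.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import eigsh

def largest_laplacian_eigenvalue(adj):
    """Largest Laplacian eigenvalue of a graph given as a sparse symmetric adjacency matrix."""
    degrees = np.asarray(adj.sum(axis=1)).ravel()
    lap = (sp.diags(degrees) - adj).astype(float)
    # which="LA" requests the largest algebraic eigenvalue of the symmetric matrix.
    vals = eigsh(lap, k=1, which="LA", return_eigenvectors=False)
    return float(vals[0])

# Star graph on n nodes: node 0 is linked to all other nodes.
n = 200
leaves = np.arange(1, n)
half = sp.coo_matrix((np.ones(n - 1), (np.zeros(n - 1, dtype=int), leaves)),
                     shape=(n, n))
adj = (half + half.T).tocsr()

mu_max = largest_laplacian_eigenvalue(adj)
```

Only matrix-vector products with the sparse Laplacian are needed, so the cost per iteration scales with the number of links.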

6. Analysis of Some Large Real-World Networks

In this section, we analyse the performance of our proposed bound, $[eqn]$ , compared to Kemeny’s constant, $[eqn]$ , in terms of accuracy and running time results. For $[eqn]$ , our implementation uses the NetworKit [31] graph library to compute the diagonal elements of $[eqn]$ (via the algorithm of Angriman et al. [27]) and the SLEPc library (https://slepc.upv.es/) (accessed on 2 December 2024) to compute the largest Laplacian eigenvalue. $[eqn]$ , in turn, is computed via Equation (3) and our implementation uses the Eigen library (http://eigen.tuxfamily.org) (accessed on 2 December 2024) to compute the entire pseudo-inverse, $[eqn]$ . We do not include any comparisons against $[eqn]$ since, computationally, it is as expensive as the exact computation of Kemeny’s constant. Our test machine is a shared-memory server with a 2x 18-Core Intel Xeon 6154 CPU and a total of 1.5 TB RAM. To ensure reproducibility, experiments are managed by SimexPal [32]. In Table 1, we list the real-world graphs that are used in our experiments, downloaded from SNAP [33] and NR [34] public repositories. In this context, we consider as medium graphs those whose vertex count is <57 K. The largest graph has around 365 K nodes and $[eqn]$ M edges.

For the medium graphs of Table 1, we are able to compare our bound $[eqn]$ relative to Kemeny’s constant $[eqn]$ , and the results are illustrated in Figure 9. $[eqn]$ is computed with different error bounds ( $[eqn]$ ) for the approximation of the diagonal elements (via the algorithm of Angriman et al. [27]); they correspond to the respective numbers next to the names in Figure 9. Regarding the accuracy, we observe that our approach for computing $[eqn]$ is overall highly accurate for all values of $[eqn]$ and graphs. More precisely, on average (computed via geometric mean) over the medium-size graphs, our approach is 0.33%, 0.27%, 0.25% and 1.26% away from the exact Kemeny’s constant for $[eqn]$ and $[eqn]$ , respectively. Meanwhile, the running time is on average $[eqn]$ and 141× faster than the exact computation for each $[eqn]$ , respectively. Figure 9a shows that on individual graphs, a larger $[eqn]$ value ( $[eqn]$ ) may result in a slightly less accurate bound, up to 10% away from the exact value (arx). Moreover, in Figure 9b, we observe that for the inf graph, computing the exact Kemeny’s constant is much faster than computing $[eqn]$ via the algorithm of [27]. The primary reason for that is the small size (6K edges) for which an exact computation of the entire pseudo-inverse is still fast enough. A second reason for the slow performance of the algorithm of Angriman et al. could be the high diameter of the graph in question (≫ $[eqn]$ ).

In Table 2, we illustrate our results for the largest graphs of Table 1. For this experiment, we set $[eqn]$ for the approximation of the diagonal elements of $[eqn]$ as this offers the best trade-off between accuracy and speed, according to the previous experiment. Unfortunately, we were not able to compute exact values for Kemeny’s constant for these graphs, as all involved runs timed out at 18,000 s. This is due to the prohibitive time and space complexity of the pseudo-inversion operation required by $[eqn]$ .

7. Conclusions

We have investigated Kemeny’s constant $[eqn]$ for a number of networks using the exact expression from [4] and compared this expression with two upper bounds: one, $[eqn]$ , that was derived in Ref. [19] and is known to be tight for vertex-transitive graphs, and another, $[eqn]$ , that was derived in [4] and is written in terms of the degrees of the nodes, the diagonal elements of the pseudo-inverse Laplacian, the largest eigenvalue of the Laplacian matrix and the heterogeneity of the degrees of the nodes.

We have numerically demonstrated that the bound $[eqn]$ is generally a much better approximation for $[eqn]$ than $[eqn]$ for the networks that we have explored. Moreover, we have proved that for any graph G composed of two regular graphs $[eqn]$ and $[eqn]$ with all nodes of the graph $[eqn]$ connected to each node of $[eqn]$ , the bound $[eqn]$ is tight. This generalises earlier findings that the bound $[eqn]$ is tight for (generalised) windmill and complete bipartite graphs.

As an illustration of the advantages of using the expression $[eqn]$ to estimate the Kemeny constant, we numerically calculated the Kemeny constant for a number of real-world large networks. We find that the calculation of $[eqn]$ can be performed very efficiently, displaying efficiency gains in the order of a factor 100–1000, for networks up to 57 K nodes. The upper bound can still be obtained in a reasonable time for networks up to 365 K nodes.

Bibliography

1. Kemeny J.G., Snell J.L. Finite Markov Chains. D. Van Nostrand: Princeton, NJ, USA, 1960.
2. Lovász L. Random Walks on Graphs: A Survey. In Paul Erdös is Eighty; Bolyai Society, Mathematical Studies: Keszthely, Hungary, 1993; Volume 2, pp. 1–46.
3. Palacios J.L., Renom J.M. Bounds for the Kirchhoff index of regular graphs via the spectra of their random walks. Int. J. Quantum Chem. 2010, 110, 1637–1641. doi:10.1002/qua.22323
4. Wang X., Dubbeldam J.L.A., Van Mieghem P. Kemeny’s constant and the effective graph resistance. Linear Algebra Its Appl. 2017, 535, 231–244.
5. Noh J.D., Rieger H. Random Walks on Complex Networks. Phys. Rev. Lett. 2004, 92, 118701. doi:10.1103/PhysRevLett.92.118701. PMID: 15089179
6. Levene M., Loizou G. Kemeny’s Constant and the Random Surfer. Am. Math. Mon. 2002, 109, 741–745. doi:10.1080/00029890.2002.11919905
7. Hunter J. Mixing times with applications to perturbed Markov chains. Linear Algebra Its Appl. 2006, 417, 108–123. doi:10.1016/j.laa.2006.02.008
8. Hunter J.J. The role of Kemeny’s constant in properties of Markov chains. Commun. Stat.-Theory Methods 2014, 43, 1309–1321.