A Stochastic Process on a Network with Connections to Laplacian Systems of Equations

Iqra Altaf Gillani; Amitabha Bagchi; Pooja Vyavahare

arXiv:1701.05296·cs.NI·July 26, 2019

TL;DR

This paper analyzes a queueing network model for multi-hop sensor data collection, revealing critical data rates that determine system stability and connecting the process to Laplacian systems relevant for distributed algorithms.

Contribution

It introduces a stochastic process model for sensor networks, establishes a phase transition based on data rate, and links the process to Laplacian systems for computational applications.

Findings

1. Existence of a critical data rate separating ergodic and non-ergodic regimes.

2. Geometric convergence to stationarity in the sub-critical regime.

3. Connections to Laplacian systems for efficient distributed algorithms.

Tables

Table 1: Rate bounds for various graphs with $w:E\rightarrow\bm{1}$ , $|V_{s}|=1$ such that $\sum_{i\in V_{s}}\bm{J}(i)=1$ .

Graph	Lower bound $\beta\geq\frac{(1-\lambda_{2}^{w})}{\sum_{i\in V_{s}}\bm{J}(i)}\frac{\sqrt{d_{\min}d_{u_{s}}}}{(d_{\max}+d_{u_{s}})}$	Exact rate
Cycle	$\frac{1}{2n^{2}}$	$\frac{2}{n}$
Star graph with sink at centre and $\epsilon$ as self-loop probability at each node	$\frac{1}{2\sqrt{n-1}}$	$1-\epsilon$
Star graph with sink and source at outer nodes	$\frac{1}{n}$	$\frac{1}{n-1}$
Complete graph	$\frac{n}{2(n-1)}$	$\frac{n}{2(n-1)}$
Random geometric graph	$\frac{\log n}{2n}$	-
Wheel graph $W_{n+1}$ with sink at centre and source at one of the cycle vertices	$\frac{\log n\sqrt{3n}}{2n^{2}}$	$\frac{1}{3}$
Wheel graph $W_{n+1}$ with source at centre and sink at one of the cycle vertices	$\frac{3\log n}{n(n+1)}$	$\frac{5}{3n}$
Complete binary tree with both source and sink at leaves	$\frac{1}{4n}$	$\frac{1}{6\log n-3}$
$k$-times star of star graph with both source and sink at leaves	$\frac{1}{n^{2}+n^{\frac{2k-1}{k}}}$	$\frac{1}{1+(2k-1)n^{1/k}}$
$k$-times star of star graph with source at center and sink at leaf	$\frac{1}{2n^{\frac{4k-1}{2k}}}$	$\frac{1}{1+(k-1)n^{1/k}}$

Equations

$$\bm{\eta}^{T}(I-P_{w})=\beta\bm{J}^{T},$$

$$\beta^{*}\geq\frac{(1-\lambda_{2}^{w})}{\sum_{i\in V_{s}}\bm{J}(i)}\,\frac{\sqrt{d_{\min}d_{u_{\mbox{\scriptsize s}}}}}{(d_{\max}+d_{u_{\mbox{\scriptsize s}}})}.$$

$$\|\mathcal{P}^{t}[\bm{x},\cdot]-\pi\|\leq RC_{\bm{x}}\rho^{t}.$$

$$\|\mathcal{P}^{t}[\bm{x},\cdot]-\mathcal{P}^{t}[\bm{y},\cdot]\|_{TV}\leq 2^{\left(\frac{8t_{\mbox{\scriptsize hit}}+(8\max\{N^{(\bm{x})},N^{(\bm{y})}\}-1)\delta}{4t_{\mbox{\scriptsize hit}}+\delta}\right)}\cdot\left(\frac{1}{2}\right)^{\frac{\delta}{4t_{\mbox{\scriptsize hit}}+\delta}\cdot t}.$$

$$\|\mathcal{P}^{t}[\bm{x},\cdot]-\pi\|_{TV}\leq 4\cdot 2^{\max\left\{\frac{8N^{(\bm{x})}\delta}{4t_{\mbox{\scriptsize hit}}+\delta},\frac{1}{2(1-\delta)\beta^{*}+1}\right\}}\cdot\left(\frac{1}{2}\right)^{\frac{\delta}{(4t_{\mbox{\scriptsize hit}}+\delta)(2(1-\delta)\beta^{*}+1)}\cdot t}.$$

$$\lim_{t\rightarrow\infty}P\left[\|Q_{t}^{\bm{J},\beta}\|_{\infty}<x\right]=F(x),\quad\mbox{and}\quad\lim_{x\rightarrow\infty}F(x)=1$$

$$\lim_{x\rightarrow\infty}\liminf_{t\rightarrow\infty}P\left[\|Q_{t}^{\bm{J},\beta}\|_{\infty}<x\right]=1$$

$$\mbox{E}\left[Q_{t+1}^{\beta}(u)\mid Q_{t}^{\beta}(u)\right]=Q_{t}^{\beta}(u)-\mathbf{1}_{\{Q_{t}^{\beta}(u)>0\}}\sum_{v:v\sim u}P_{w}[u,v]+\sum_{v:v\sim u}P_{w}[v,u]\mathbf{1}_{\{Q_{t}^{\beta}(v)>0\}}+A_{t}(u)$$

$$g^{\prime}(\beta)=\lim_{d\beta\rightarrow 0}\left(g(\beta)-g(\beta-d\beta)\mid\Lambda_{t}^{\beta}=1\right)\left(|V_{s}|t(1-d\beta)^{|V_{s}|t-1}\right)$$

$$P\left[Q_{t}^{\beta}(u)>0\mid Q_{t}^{\beta-d\beta}(u)=0\right]=\frac{P\left[Q_{t}^{\beta}(u)>0\cap Q_{t}^{\beta-d\beta}(u)=0\right]}{P\left[Q_{t}^{\beta-d\beta}(u)=0\right]}=\sum_{t^{\prime}=1}^{t}d\beta\,P_{t^{\prime}}$$

$$P\left[Q_{t}^{\beta}(u)>0\right]-P\left[Q_{t}^{\beta-d\beta}(u)>0\right]\leq\sum_{t^{\prime}=1}^{t}d\beta\,P_{t^{\prime}}.$$

$$\mbox{E}\left[Q_{t+1}^{\beta}(u)\mid Q_{t}^{\beta}(u)\right]=Q_{t}^{\beta}(u)-\mathbf{1}_{\{Q_{t}^{\beta}(u)>0\}}\sum_{v:v\sim u}P_{w}[u,v]+\sum_{v:v\sim u}P_{w}[v,u]\mathbf{1}_{\{Q_{t}^{\beta}(v)>0\}}+\beta\bm{J}(u).$$

$$-\sum_{v:u^{*}\sim v}P_{w}[u^{*},v]+\sum_{v:u^{*}\sim v,\,v\in V\setminus\{u_{\mbox{\scriptsize s}}\}}P_{w}[v,u^{*}]+\beta\bm{J}(u^{*})$$

$$I_{t}(u)=\sum_{v\in U}A_{t}^{uv}+\sum_{v\in P}A_{t}^{uv}.$$

$$-\sum_{v:u\sim v}P_{w}[u,v]+\sum_{u\sim v,\,v\in P}P_{w}[v,u]+\sum_{u\sim v,\,v\in U}P_{w}[v,u]P_{w}\left[\bar{Q}_{t}^{\beta,U}(u)>0\right]+\beta\bm{J}(u).$$

$$\mbox{E}\left[Q_{t+1}^{\beta}(u)\mid Q_{t}^{\beta}(u)\right]=Q_{t}^{\beta}(u)-\mathbf{1}_{\{Q_{t}^{\beta}(u)>0\}}\sum_{v:v\sim u}P_{w}[u,v]+\sum_{v:v\sim u}P_{w}[v,u]\mathbf{1}_{\{Q_{t}^{\beta}(v)>0\}}+A_{t}(u),$$

$$\mbox{E}\left[Q_{t+1}^{\beta}(u)\right]=\mbox{E}\left[Q_{t}^{\beta}(u)\right]-\bm{\eta}_{t}^{\beta}(u)\sum_{v:v\sim u}P_{w}[u,v]+\sum_{v:v\sim u}P_{w}[v,u]\bm{\eta}_{t}^{\beta}(v)+\beta\bm{J}(u).$$

$$-\bm{\eta}^{\beta}(u)\sum_{v:v\sim u}P_{w}[u,v]+\sum_{v:v\sim u}P_{w}[v,u]\bm{\eta}^{\beta}(v)+\beta\bm{J}(u)=0.$$

$$\bm{\eta}^{T}(I-P_{w})=\beta\bm{J}^{T}.$$

$$\bm{x}^{T}L=\beta\bm{J}^{T}$$

$$\bm{\eta}^{T}(I-P_{w})=\beta\bm{J}^{T}.$$

$$\bm{\eta}^{T}(I-P_{w})\geq(1-\lambda_{2}^{w})\sum_{i=2}^{|V|}\langle\bm{\eta}^{T},f_{i}\rangle_{\mu}f_{i}.$$

$$\sum_{i=2}^{n}\langle\bm{\eta}^{T},f_{i}\rangle_{\mu}^{2}=\|\bm{\eta}^{T}\|_{\mu}^{2}-\langle\bm{\eta}^{T},f_{1}\rangle_{\mu}^{2}.$$

$$\sum\limits_{i=2}^{n}{\langle\bm{\eta}^{T},f_{i}\rangle_{\mu}^{2}}=\sum\limits_{i=1}^{n}{\bm{\eta}^{2}(i)\mu(i)}-\Bigg(\sum\limits_{i=1}^{n}{\bm{\eta}(i)\mu(i)}\Bigg)^{2}=\mbox{Var}_{\mu}(\bm{\eta}(i))=\sum\limits_{i=1}^{n}(\bm{\eta}(i)-\bar{\bm{\eta}}_{\mu})^{2}\mu(i)$$

$$\|\bm{\eta}^{T}(I-P_{w})\|_{\mu}^{2}\geq(1-\lambda_{2}^{w})^{2}\,\mbox{Var}_{\mu}(\bm{\eta}(i)).$$

$$\beta\geq\frac{(1-\lambda_{2}^{w})}{\|\bm{J}^{T}\|_{\mu}}\sqrt{\mbox{Var}_{\mu}(\bm{\eta}(i))}.$$

$$\|\bm{J}^{T}\|_{\mu}$$

$$\mbox{Var}_{\mu}(\bm{\eta}(i))\geq(1-\delta-\bar{\bm{\eta}}_{\mu})^{2}\mu(u_{\max})+(\bar{\bm{\eta}}_{\mu}-0)^{2}\mu(u_{\mbox{\scriptsize s}})\geq\frac{(1-\delta)^{2}\mu(u_{\max})\mu(u_{\mbox{\scriptsize s}})}{\mu(u_{\max})+\mu(u_{\mbox{\scriptsize s}})}.$$

Taxonomy

Topics: Advanced Queuing Theory Analysis · Simulation Techniques and Applications · Distributed systems and fault tolerance

Full text

A Stochastic Process on a Network with Connections to Laplacian Systems of Equations

Iqra Altaf Gillani

{iqraaltaf,bagchi}@cse.iitd.ac.in

Department of Computer Science and Engineering, IIT Delhi

Amitabha Bagchi

{iqraaltaf,bagchi}@cse.iitd.ac.in

Department of Computer Science and Engineering, IIT Delhi

Pooja Vyavahare

[email protected]

Department of Electrical Engineering, IIT Tirupati

Abstract

We study an open discrete-time queueing network that models the collection of data in a multi-hop sensor network. We assume data is generated at the sensor nodes as a discrete-time Bernoulli process. All nodes in the network maintain a queue and relay data, which is to be finally collected by a designated sink. We prove that the resulting multi-dimensional Markov chain representing the queue size of nodes has two behavior regimes depending on the value of the rate of data generation. In particular, we show that there is a non-trivial critical value of data rate below which the chain is ergodic and converges to a stationary distribution and above which it is non-ergodic, i.e., the queues at the nodes grow in an unbounded manner. We show that the rate of convergence to stationarity is geometric in the sub-critical regime. We also show the connections of this process to a class of Laplacian systems of equations whose solutions include the important problem of finding the effective resistance between two nodes, a subroutine that has been widely used to develop efficient algorithms for a number of computational problems. Hence our work provides the theoretical basis for a new class of distributed algorithms for these problems.

Keywords: Ergodicity; Geometric Ergodicity; Random walks; Queueing networks; Stationary distribution

1 Introduction

We study a stochastic process arising from a natural routing and scheduling scheme used to collect data from sensor nodes over multi-hop relay networks [9, 10, 1]. We model the sensor network as a graph, some of whose vertices produce data packets according to a discrete-time Bernoulli process. One node is designated as a sink that has to collect the data generated in the network and all other nodes relay the data. Each node maintains a queue and relays at most one packet in a time slot in the manner of the “gossip” models widely studied in the networking and distributed computing literature [12][3][24]. The packet is relayed to a random neighbor, in the manner of a random walk on the graph. Our model is, therefore, an open discrete-time queueing network whose interconnections are described by an undirected (simple) graph.

Because of its relationship with the data collection task, we call our process the Data Collection Process. It is defined on a graph $G=(V,E)$ equipped with a positive edge-weight function $w:E\rightarrow\mathbb{R}_{+}$ . The process takes two parameters, a relative rate vector $\bm{J}\in(0,1)^{|V|}$ and a rate $\beta\in(0,1)$ ; we assume that node $v\in V$ produces a packet with probability $\beta\bm{J}(v)$ in a given time slot. For a given relative rate vector, the process has two behavior regimes and undergoes a sharp transition between them, the controlling parameter being the rate $\beta$ . Specifically, we will show that there is a critical value $\beta^{*}$ such that when $\beta>\beta^{*}$ the process is non-ergodic and the size of the queues grows to infinity, whereas when $\beta<\beta^{*}$ the process is ergodic, so that all queues are almost surely finite and the system converges to a stationary distribution. For this latter regime, we also show that the rate of convergence is geometric, i.e., the Data Collection Process is geometrically ergodic.

For $\beta<\beta^{*}$ the process also has an unexpected connection with a subclass of systems of linear equations, which we refer to as “one-sink” Laplacian systems. The importance of this subclass comes from the fact that the effective resistance between a pair of nodes in a network can be computed by solving a one-sink Laplacian system [26] [23]. Over the last few years this connection to effective resistance has been repeatedly exploited to develop state-of-the-art algorithms for computing max flows in networks [5][2], random spanning trees of graphs [11][23], graph sparsification [26][13], and expander generation [6]. The connection of our Data Collection Process to one-sink Laplacian systems opens up a new direction for the design of efficient distributed algorithms computing an array of important structures and quantities on graphs as we have shown in [8]. However, efficient algorithms based on the Data Collection Process depend on fundamental mathematical properties of the process. Specifically, a stationary distribution must exist, and convergence towards it must be guaranteed in a reasonable time. This paper addresses those needs.

The rest of the paper is organized as follows. In Section 2, we discuss our main results. In Section 3, we prove existence of a non-trivial critical data rate below which the process is ergodic and above which it is non-ergodic. Then, in Section 4 we characterize this rate in terms of underlying graph parameters. In Section 5, we prove that the process is not only ergodic but geometrically ergodic and find the rate of convergence of the associated Markov chain to its stationary distribution. Finally, we conclude and give some directions for future work in Section 6.

2 Main results

2.1 Our model: The Data Collection Process

We consider a stochastic process on a network modeled by an undirected graph $G=(V,E,w)$ , where $V$ is the set of $n$ nodes, $E$ is the set of edges with $|E|=m$ , and $w:E\rightarrow\mathbb{R}_{+}$ is a positive weight function. We write $u\sim v$ if $(u,v)\in E$ and define $\mbox{Nbd}(u):=\{v\in V|(u,v)\in E\}$ . The generalized degree of node $u$ is defined as $\mbox{deg}(u):=\sum_{v\in\mbox{Nbd}(u)}w_{uv}$ . We denote the maximum and minimum generalized degree among all nodes in the network by $d_{\max}$ and $d_{\min}$ respectively.

We consider time to be discrete and define the process in terms of the generation, movement and disappearance of “packets” from the system. In order to do this we are given a relative rate vector $\bm{J}\in\mathbb{R}^{n}$ with the properties that (i) $\bm{J}(v)<0$ for exactly one node and (ii) $\sum_{i=1}^{n}\bm{J}(i)=0$ . The node $v$ for which $\bm{J}(v)<0$ is called the sink and we will use $u_{\mbox{\scriptsize s}}$ to denote it hereafter. We also define a set of source nodes: $V_{s}=\{v:\bm{J}(v)>0\}$ . We are also given a rate parameter $\beta\geq 0$ such that $\max_{i=1}^{n}\beta\bm{J}(i)\leq 1$ . We assume that each node in $V\setminus\{u_{\mbox{\scriptsize s}}\}$ is equipped with a queue. The number of packets in the queue at $u$ at time $t$ is denoted by $Q_{t}(u)$ .

Packets appear in the system at the source nodes $v\in V_{s}$ , which receive external packet arrivals as independent Bernoulli processes with rate $\beta\bm{J}(v)$ . A packet received externally is placed in the queue at $v$ . Packet movement at time $t$ takes place as follows: for each $u\in V\setminus\{u_{\mbox{\scriptsize s}}\}$ , if $Q_{t}^{\beta}(u)>0$ , a single data packet is picked at random from the queue and sent to a neighbor $v\in\mbox{Nbd}(u)$ with probability $w_{uv}/\mbox{deg}(u)$ . So, each node sends at most one packet from its queue in one time step and may receive multiple packets, up to one from each neighbor. A packet is removed from the system when a neighbor of $u_{\mbox{\scriptsize s}}$ decides to transmit that packet to $u_{\mbox{\scriptsize s}}$ .
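The dynamics just described can be sketched as a short simulation. This is an illustrative sketch rather than code from the paper: the function name `step`, the dictionary-based graph encoding, and the example instance (a complete graph on four nodes with unit weights, a single source at node 0 and the sink at node 3) are assumptions made for the example. The sketch also assumes that transmission decisions use the queue state at the start of the slot and that new arrivals are added at the end of the slot.

```python
import random

def step(queues, nbrs, weights, J, beta, sink, rng):
    """Advance all queues by one slot; returns (new_queues, generated, absorbed)."""
    new = dict(queues)
    generated = absorbed = 0
    # Movement: every non-empty node forwards one packet to a random neighbor,
    # chosen with probability w_uv / deg(u).
    for u, q in queues.items():
        if q > 0:
            v = rng.choices(nbrs[u], weights=weights[u])[0]
            new[u] -= 1
            if v == sink:
                absorbed += 1      # the packet leaves the system at the sink
            else:
                new[v] += 1
    # Arrivals: each source node receives a packet with probability beta * J(u).
    for u in queues:
        if J[u] > 0 and rng.random() < beta * J[u]:
            new[u] += 1
            generated += 1
    return new, generated, absorbed

# Hypothetical instance: complete graph K4, unit weights, source 0, sink 3.
sink = 3
nbrs = {u: [v for v in range(4) if v != u] for u in range(3)}
weights = {u: [1.0] * 3 for u in range(3)}
J = {0: 1.0, 1: 0.0, 2: 0.0}
rng = random.Random(0)

queues = {u: 0 for u in range(3)}   # the sink keeps no queue
total_generated = total_absorbed = 0
for _ in range(5000):
    queues, g, a = step(queues, nbrs, weights, J, beta=0.4, sink=sink, rng=rng)
    total_generated += g
    total_absorbed += a
print(queues, total_generated, total_absorbed)
```

Since packets are only created at sources and only removed at the sink, the number of packets still queued always equals the number generated minus the number absorbed, which gives a simple sanity check on any implementation.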

In the following we will refer to the $|V|-1$ -dimensional Markov chain $\left\{Q^{\bm{J},\beta}_{t}\right\}_{t\geq 0}$ as the Data Collection Process on $G$ with relative rate vector $\bm{J}$ and rate parameter $\beta$ . Mostly we will omit $\bm{J}$ from the superscript since it will be understood. Occasionally we will maintain $\beta$ in the superscript but dispense with it when it is understood.

2.2 Ergodicity is a critical phenomenon for the Data Collection process

The Data Collection process has two distinct regimes, one ergodic and one non-ergodic, as we vary $\beta$ and there is a sharp transition between them. We find that there is a non-trivial $\beta^{*}>0$ such that the chain $\left\{Q^{\bm{J},\beta}_{t}\right\}_{t\geq 0}$ is ergodic for $\beta$ below this value and converges to a stationary distribution. Above $\beta^{*}$ the system displays drift and the queue sizes grow unbounded as $t\rightarrow\infty$ . Specifically we show the following theorem:

Theorem 1.

Consider a weighted undirected graph $G=(V,E,w)$ and a relative rate vector $\bm{J}$ with $\bm{J}(v)<0$ for exactly one $v\in V$ . If the random walk on $G$ with transition matrix $P_{w}$ where $P_{w}[u,v]=w_{uv}/\mbox{deg}(u)$ is irreducible and aperiodic then there exists a $\beta^{*}>0$ such that the resulting multi-dimensional Markov chain $\left\{Q^{\bm{J},\beta}_{t}\right\}_{t\geq 0}$ is ergodic for all $\beta<\beta^{*}$ and non-ergodic for all $\beta\geq\beta^{*}$ .

Although it is difficult to prove ergodicity results for multi-dimensional Markov chains in general, we show in Section 3 how the induction-based technique developed by Georgiadis and Szpankowski [7], and later summarized by Szpankowski in his study of slotted ALOHA [28], can be applied to prove this result.

2.3 A lower bound on the critical rate

When $\beta<\beta^{*}$ the Data Collection process is ergodic and has a stationary distribution so we can define $\bm{\eta}^{\beta}(v)=\lim_{t\rightarrow\infty}P\left[Q_{t}^{\beta}(v)>0\right]$ for all $v\in V\setminus\{u_{\mbox{\scriptsize s}}\}$ . We will show in Section 4 that at stationarity the vector $\bm{\eta}$ extended to $u_{\mbox{\scriptsize s}}$ by setting $\bm{\eta}(u_{\mbox{\scriptsize s}})=0$ is a solution of a linear system because

$$\bm{\eta}^{T}(I-P_{w})=\beta\bm{J}^{T},$$

where $P_{w}$ is the transition matrix of the random walk defined on $G$ by the weight function $w$ . In Section 4.1 we discuss the relationship of this system to the Laplacian of $G$ and the implications of this relationship. For now, we state one important consequence of this relationship: a lower bound on $\beta^{*}$ .

Theorem 2.

Suppose we have a Data Collection process with relative rate vector $\bm{J}$ such that $\bm{J}(v)<0$ only for $v=u_{\mbox{\scriptsize s}}$ , defined on a graph $G=(V,E,w)$ that satisfies the conditions of Theorem 1 and has critical rate $\beta^{*}$ . Then if $P_{w}$ is the transition matrix of the random walk defined by $w$ on $G$ and $\lambda_{2}^{w}$ is the second largest eigenvalue of $P_{w}$ then

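$$\beta^{*}\geq\frac{(1-\lambda_{2}^{w})}{\sum_{i\in V_{s}}\bm{J}(i)}\,\frac{\sqrt{d_{\min}d_{u_{\mbox{\scriptsize s}}}}}{(d_{\max}+d_{u_{\mbox{\scriptsize s}}})}.$$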
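As a numerical illustration, the sketch below solves the linear system $\bm{\eta}^{T}(I-P_{w})=\beta\bm{J}^{T}$ with $\bm{\eta}(u_{\mbox{\scriptsize s}})=0$ and evaluates the lower bound in the form given in the header of Table 1, $\frac{(1-\lambda_{2}^{w})}{\sum_{i\in V_{s}}\bm{J}(i)}\frac{\sqrt{d_{\min}d_{u_{s}}}}{(d_{\max}+d_{u_{s}})}$ . The function names `occupancy_vector` and `lower_bound`, the choice of the complete graph $K_{4}$, and the reduction to the non-sink coordinates are assumptions of this example, not part of the paper.

```python
import numpy as np

def transition_matrix(W):
    """Random-walk matrix P_w[u, v] = w_uv / deg(u) and the generalized degrees."""
    deg = W.sum(axis=1)
    return W / deg[:, None], deg

def occupancy_vector(W, J, beta, sink):
    """Solve eta^T (I - P_w) = beta J^T with eta(sink) = 0."""
    P, _ = transition_matrix(W)
    keep = [u for u in range(len(J)) if u != sink]
    # Restricting to the non-sink nodes gives an invertible system.
    A = np.eye(len(keep)) - P[np.ix_(keep, keep)]
    eta = np.zeros(len(J))
    eta[keep] = np.linalg.solve(A.T, beta * J[keep])
    return eta

def lower_bound(W, J, sink):
    """Evaluate the lower bound on the critical rate from the header of Table 1."""
    _, deg = transition_matrix(W)
    # D^{-1/2} W D^{-1/2} is symmetric and has the same eigenvalues as P_w.
    S = W / np.sqrt(np.outer(deg, deg))
    lam2 = np.sort(np.linalg.eigvalsh(S))[-2]
    source_mass = J[J > 0].sum()
    return (1 - lam2) / source_mass * np.sqrt(deg.min() * deg[sink]) / (deg.max() + deg[sink])

# Hypothetical instance: K4 with unit weights, source 0, sink 3.
n, sink, beta = 4, 3, 0.5
W = np.ones((n, n)) - np.eye(n)
J = np.array([1.0, 0.0, 0.0, -1.0])

eta = occupancy_vector(W, J, beta, sink)
bound = lower_bound(W, J, sink)
# eta is linear in beta, so this is the rate at which the largest entry reaches 1.
critical_guess = beta / eta.max()
print(eta, bound, critical_guess)
```

On this instance the bound and the rate at which the largest occupancy probability reaches 1 both equal $2/3$, which agrees with the complete-graph row of Table 1 for $n=4$.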

2.4 Geometric Ergodicity

We show that when $\beta<\beta^{*}$ the Data Collection process converges to its stationary distribution at a geometric rate, i.e., the process is geometrically ergodic. Following Meyn and Tweedie [22], we define geometric ergodicity formally:

Definition 1 (Geometric ergodicity).

Given an irreducible and aperiodic Markov chain $\Phi$ defined on state space $\mathcal{X}$ with transition probability $\mathcal{P}[\cdot,\cdot]$ and stationary distribution $\pi$ , the chain is said to be geometrically ergodic if there exist constants $\rho<1$ , $R>0$ , and, for every state $\bm{x}\in\mathcal{X}$ there exists a $C_{\bm{x}}<\infty$ , such that for all $t>0$ ,

$$\|\mathcal{P}^{t}[\bm{x},\cdot]-\pi\|\leq RC_{\bm{x}}\rho^{t}.$$

We use the coupling method to prove that convergence happens at a geometric rate. The convergence rate is in terms of the hitting time, $t_{\mbox{\scriptsize hit}}$ , of the random walk $P_{w}$ defined on $G$ so we provide a definition of this quantity. If $\{X_{t}\}_{t\geq 0}$ is a random walk on $G$ and $\tau_{v}=\min\{t:X_{t}=v\}$ , then $t_{\mbox{\scriptsize hit}}=\max_{u,v\in V}{\mbox{E}\left[\tau_{v}\mid X_{0}=u\right]},$ i.e., the maximum over all pairs $(u,v)$ of vertices of the expected time taken for a random walk begun at $u$ to first reach the vertex $v$ . We show the following convergence theorem.
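The hitting time defined above can be computed exactly for small graphs by solving one linear system per target vertex. The following is a minimal sketch under that approach; the function name `max_hitting_time` and the two example graphs are assumptions of this illustration. For a fixed target $v$, the expected hitting times $h(u)=\mbox{E}\left[\tau_{v}\mid X_{0}=u\right]$ satisfy $h(v)=0$ and $h(u)=1+\sum_{x}P_{w}[u,x]h(x)$ for $u\neq v$.

```python
import numpy as np

def max_hitting_time(P):
    """Return max over (u, v) of the expected time for the walk P to go from u to v."""
    n = P.shape[0]
    worst = 0.0
    for v in range(n):
        keep = [u for u in range(n) if u != v]
        # (I - P restricted to V \ {v}) h = 1 gives the expected hitting times of v.
        A = np.eye(n - 1) - P[np.ix_(keep, keep)]
        h = np.linalg.solve(A, np.ones(n - 1))
        worst = max(worst, h.max())
    return worst

# Complete graph K4 with unit weights.
W_complete = np.ones((4, 4)) - np.eye(4)
P_complete = W_complete / W_complete.sum(axis=1)[:, None]

# Path on three vertices 0 - 1 - 2.
P_path = np.array([[0.0, 1.0, 0.0],
                   [0.5, 0.0, 0.5],
                   [0.0, 1.0, 0.0]])

t_hit_complete = max_hitting_time(P_complete)
t_hit_path = max_hitting_time(P_path)
print(t_hit_complete, t_hit_path)
```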

Theorem 3.

Consider $\left\{Q_{t}^{\bm{J},\beta}\right\}_{t\geq 0}$ defined on $G=(V,E,w)$ such that there is a critical $\beta^{*}$ as described in Theorem 1. Let $\beta=\beta^{*}(1-\delta)$ for $\delta\in(0,1)$ and denote by $\mathcal{P}$ the transition matrix for the resulting multi-dimensional Markov Chain. Suppose we have $\bm{x},\bm{y}\in(\mathbb{N}\cup\{0\})^{|V|-1}$ with $\sum_{i=1}^{|V|-1}\bm{x}(i)=N^{(\bm{x})},\sum_{i=1}^{|V|-1}\bm{y}(i)=N^{(\bm{y})}$ . Then

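$$\|\mathcal{P}^{t}[\bm{x},\cdot]-\mathcal{P}^{t}[\bm{y},\cdot]\|_{TV}\leq 2^{\left(\frac{8t_{\mbox{\scriptsize hit}}+(8\max\{N^{(\bm{x})},N^{(\bm{y})}\}-1)\delta}{4t_{\mbox{\scriptsize hit}}+\delta}\right)}\cdot\left(\frac{1}{2}\right)^{\frac{\delta}{4t_{\mbox{\scriptsize hit}}+\delta}\cdot t}.$$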

Convergence to stationarity can be derived as a special case of Theorem 3 by choosing $\bm{y}\in(\mathbb{N}\cup\{0\})^{|V|-1}$ according to $\pi$ , the stationary distribution of the chain $\left\{Q_{t}^{\bm{J},\beta}\right\}_{t\geq 0}$ . This establishes the geometric ergodicity of the Data Collection process in the subcritical regime.

Corollary 1.

Consider the multi-dimensional Markov chain $\left\{Q_{t}^{\bm{J},\beta}\right\}_{t\geq 0}$ with $\beta=\beta^{*}(1-\delta)$ for $\delta\in(0,1)$ as defined in Theorem 3 and denote its stationary distribution by $\pi$ . For $\bm{x}\in(\mathbb{N}\cup\{0\})^{|V|-1}$ such that $\sum_{i=1}^{|V|-1}\bm{x}(i)=N^{(\bm{x})}$ ,

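$$\|\mathcal{P}^{t}[\bm{x},\cdot]-\pi\|_{TV}\leq 4\cdot 2^{\max\left\{\frac{8N^{(\bm{x})}\delta}{4t_{\mbox{\scriptsize hit}}+\delta},\frac{1}{2(1-\delta)\beta^{*}+1}\right\}}\cdot\left(\frac{1}{2}\right)^{\frac{\delta}{(4t_{\mbox{\scriptsize hit}}+\delta)(2(1-\delta)\beta^{*}+1)}\cdot t}.$$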

Moreover, for the special case that $\bm{x}=\bm{0}$ , i.e., the system begins with empty queues, the Markov chain mixes to within $1/M$ of its stationary distribution in terms of total variation distance for any parameter $M>0$ in time $t$ that is $\Theta\left(\frac{t_{\mbox{\scriptsize hit}}\log M}{\delta}\right)$ .
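To get a feel for the scaling in $t_{\mbox{\scriptsize hit}}$, $M$ and $\delta$, the sketch below evaluates a bound of the form $4\cdot 2^{\max\{8N^{(\bm{x})}\delta/(4t_{hit}+\delta),\,1/(2(1-\delta)\beta^{*}+1)\}}\cdot(1/2)^{\delta t/((4t_{hit}+\delta)(2(1-\delta)\beta^{*}+1))}$ and the first time at which it drops below $1/M$. The exact form of the bound is an assumption of this example (it should be checked against the statement of Corollary 1), and the function names and parameter values are hypothetical.

```python
import math

def _exponents(t_hit, delta, beta_star, n_x):
    """Prefactor exponent and per-step decay exponent of the assumed bound."""
    slow = 2 * (1 - delta) * beta_star + 1
    prefactor = max(8 * n_x * delta / (4 * t_hit + delta), 1 / slow)
    rate = delta / ((4 * t_hit + delta) * slow)
    return prefactor, rate

def tv_bound(t, t_hit, delta, beta_star, n_x=0):
    """Upper bound on the total variation distance to stationarity after t steps."""
    prefactor, rate = _exponents(t_hit, delta, beta_star, n_x)
    return 4 * 2 ** prefactor * 0.5 ** (rate * t)

def time_to_reach(M, t_hit, delta, beta_star, n_x=0):
    """Smallest integer t at which tv_bound(t) <= 1/M, obtained by solving for t."""
    prefactor, rate = _exponents(t_hit, delta, beta_star, n_x)
    return math.ceil((2 + prefactor + math.log2(M)) / rate)

# Hypothetical parameters: t_hit = 3 and beta* = 2/3 (as for K4), empty start.
t_mix = time_to_reach(M=100, t_hit=3.0, delta=0.5, beta_star=2.0 / 3.0)
t_mix_small_delta = time_to_reach(M=100, t_hit=3.0, delta=0.05, beta_star=2.0 / 3.0)
print(t_mix, t_mix_small_delta)
```

With empty initial queues the prefactor exponent is at most 1, so the returned time grows proportionally to $(4t_{hit}+\delta)\log M/\delta$ up to constants, which is the $\Theta\left(\frac{t_{hit}\log M}{\delta}\right)$ behavior stated above.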

3 Ergodicity as a critical phenomenon

In this section, we prove existence of a non-trivial critical data rate $\beta^{*}$ for the multi-dimensional Markov chain $\left\{Q_{t}^{\bm{J},\beta}\right\}_{t\geq 0}$ associated with the Data Collection process such that the chain is ergodic for all values below $\beta^{*}$ and non-ergodic above it.

For a given Data Collection process on a network modeled by an undirected graph $G=(V,E,w)$ , there is an associated $\lvert V\rvert-1$ -dimensional vector $Q_{t}^{\beta}$ where each $Q_{t}^{\beta}(u)$ represents the queue size at a given node $u\in V\setminus\{u_{\mbox{\scriptsize s}}\}$ given a data rate $\beta$ . Since the Data Collection process is a queueing system, the question of stability arises, i.e., we need to understand whether the system is able to successfully transfer data at a given value of $\beta$ , which is the controlling parameter for the rate at which packets appear in the system. For this, following Loynes [18] and Szpankowski [28], we formally define a notion of a stable data rate as follows.

Definition 2 (Stable rate).

Given a weighted undirected graph $G=(V,E,w)$ and a relative rate vector $\bm{J}$ with $\bm{J}(v)<0$ for exactly one $v\in V$ , the process $Q^{\bm{J},\beta}_{t}$ is said to be stable and a value $\beta\geq 0$ of the rate parameter is said to be a stable rate if

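$$\lim_{t\rightarrow\infty}P\left[\|Q_{t}^{\bm{J},\beta}\|_{\infty}<x\right]=F(x),\quad\mbox{and}\quad\lim_{x\rightarrow\infty}F(x)=1,$$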

where $F(x)$ is the limiting distribution function.

However, if a weaker condition holds i.e.,

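$$\lim_{x\rightarrow\infty}\liminf_{t\rightarrow\infty}P\left[\|Q_{t}^{\bm{J},\beta}\|_{\infty}<x\right]=1,$$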

the process is said to be substable and otherwise unstable. So, a stable process is necessarily substable and for a substable process to be stable its distribution function should tend to a limit. Thus, by stability we mean the distribution of $Q^{\bm{J},\beta}_{t}$ as $t\rightarrow\infty$ exists. Moreover, if the limiting distribution is a stationary distribution, then the process is ergodic. So, for queueing systems ergodicity and stability can be used interchangeably.

In general, proving ergodicity of a multi-dimensional Markov chain is difficult. However, for the Markov chain $\left\{Q_{t}^{\bm{J},\beta}\right\}_{t\geq 0}$ corresponding to our Data Collection process the proof is tractable, because this process belongs to a class of multi-queue systems for which Szpankowski and others showed a general method for proving the existence of a “stability region” of this kind [28]. Building on the work of Malyšev [20] on two-dimensional Markov chains as extended by Malyšev and Menšikov [21] to multi-dimensional chains, Georgiadis and Szpankowski developed an induction-based technique to characterize the stability region of the multi-queue system described by token passing rings [7]. After applying this technique to several related systems, Szpankowski noted in his study of slotted ALOHA [28] that all the systems amenable to this technique had certain properties. We will first discuss that general characterization (properties) and then show that the Data Collection process falls within it.

Consider a multi-queue process $N_{t}$ with a set $M$ of queues, and a partition $\mathscr{P}=(P,U)$ of $M$ , where $P$ is the set of persistent users, which can transmit dummy packets even when their queues are empty, and $U$ is the set of non-persistent users, which behave as normal queues. For the given partition $\mathscr{P}$ , define a modified multi-queue process $\bar{N}_{t}^{\mathscr{P}}$ in which the queues in $P$ are never allowed to become empty and the queues in $U$ behave as in the original process $N_{t}$ . To characterize the stability region of processes like $N_{t}$ , Szpankowski’s induction-based technique requires three conditions to hold:

1. Monotonicity. The queues in the modified process are always longer due to the persistent users, i.e., $N_{t}\preceq_{\mbox{\tiny SD}}\bar{N}_{t}^{\mathscr{P}}$ .

2. Stationarity of $U$ . Since the users in set $U$ mimic the original process, the transmissions from $U$ that enter $P$ should form a stationary and ergodic sequence so that Loynes’ scheme for one-dimensional queues [18] can be applied to establish the stationarity of a persistent queue (in order to perform the induction step).

3. Identical behaviors when non-empty. $N_{t}$ and $\bar{N}_{t}^{\mathscr{P}}$ behave identically as long as their queues are non-empty. Only when $N_{t}(u)$ empties for some $u\in M$ and $\bar{N}_{t}^{\mathscr{P}}(u)$ is non-empty for $u\in P$ do they behave differently.

Szpankowski’s general characterization is primarily based on an intrinsic coupling between the two processes $N_{t}$ and $\bar{N}_{t}^{\mathscr{P}}$ , as indicated by the first and third properties. In this coupling, starting from the same initial state, both processes follow the same transmission decisions, i.e., if one process makes a transmission decision then the same decision is followed in the other, so that the trajectories of the two processes are coupled. Note that even if a queue in one of the processes is empty and the corresponding queue in the other is non-empty, any transmission decision of the latter will still be followed by the former, although because the queue is empty it will have no effect on its queue size or that of its neighbors. To show that the Data Collection process on graph $G=(V,E,w)$ also falls within the domain of this general characterization, we will use different variations of this coupling for the corresponding Markov chain $Q_{t}^{\beta}$ over the space $\{0,1\}^{V\times\mathbb{N}}\times\{0,1\}^{E\times\mathbb{N}}$ .

To start with, using a coupling-based argument, we will first prove an interesting property of the Data Collection process and its corresponding multi-dimensional Markov chain, which satisfies Szpankowski’s first condition on monotonicity. In particular, we will show that for the Markov chain $Q_{t}^{\beta}$ , the queue occupancy probability of a node $P\left[Q^{\beta}_{t}(u)>0\right]$ is an increasing function of $\beta$ for all $u\in V\setminus\{u_{\mbox{\scriptsize s}}\}$ and it is continuous for all $\beta<\beta^{*}$ , where $\beta^{*}$ is the critical rate above which the queues are unstable and below which they are stable.

Lemma 1.

Consider an undirected graph $G=(V,E,w)$ running a Data Collection process, and let $Q^{\beta}_{t}$ represent the queues at time $t$ for all nodes $u\in V\setminus\{u_{\mbox{\scriptsize s}}\}$ . Then, for all such nodes, $P\left[Q^{\beta}_{t}(u)>0\right]$ is

1. an increasing function of $\beta$ , and

2. continuous for all $\beta<\beta^{*}$ where $\beta^{*}$ is the critical data rate such that all data rates $\beta<\beta^{*}$ are stable and $\beta\geq\beta^{*}$ are unstable.

Proof.

(1). To prove this property, we will first establish that the multi-dimensional Markov chain $Q_{t}^{\beta}$ is stochastically ordered i.e., stochastically larger initial states will produce stochastically larger chains at all times. For this, let us consider a coupling as used by Szpankowski of two trajectories of this chain $\{Q_{t}^{\beta}\}$ and $\{\bar{Q}_{t}^{\beta}\}$ such that $\bar{Q}_{0}^{\beta}\preceq_{\mbox{\tiny SD}}Q_{0}^{\beta}$ . Now, assume the stochastic dominance relation between the two holds at time $t$ i.e., $\bar{Q}_{t}^{\beta}\preceq_{\mbox{\tiny SD}}Q_{t}^{\beta}$ . Then, at time step $t+1$ for both $Q_{t}^{\beta}$ and $\bar{Q}_{t}^{\beta}$ from the one-step basic queue evolution equation at all nodes $u\in V\setminus\{u_{\mbox{\scriptsize s}}\}$ we have

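$$\mbox{E}\left[Q_{t+1}^{\beta}(u)\mid Q_{t}^{\beta}(u)\right]=Q_{t}^{\beta}(u)-\mathbf{1}_{\{Q_{t}^{\beta}(u)>0\}}\sum_{v:v\sim u}P_{w}[u,v]+\sum_{v:v\sim u}P_{w}[v,u]\mathbf{1}_{\{Q_{t}^{\beta}(v)>0\}}+A_{t}(u),\tag{7}$$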

where $A_{t}(u)$ is the number of packets generated at $u$ , which is 0 if $u\notin V_{s}$ and is 1 with probability $\beta\bm{J}(u)$ if $u\in V_{s}$ , so ${\mbox{E}\left[A_{t}(u)\right]}=\beta\bm{J}(u)$ . Now consider any node $u$ at time $t+1$ . By the induction hypothesis, the queues at node $u$ and at its neighbors in $Q_{t}^{\beta}$ dominate the ones in $\bar{Q}^{\beta}_{t}$ , so the first three terms on the right of Eq. (7) for $Q_{t}^{\beta}(u)$ dominate the ones for $\bar{Q}^{\beta}_{t}(u)$ , and since $\beta$ is the same, the last term is the same in both cases. So, we have $\bar{Q}_{t+1}^{\beta}(u)\preceq_{\mbox{\tiny SD}}Q_{t+1}^{\beta}(u)$ . This is true for all nodes $u\in V\setminus\{u_{\mbox{\scriptsize s}}\}$ , so at time $t+1$ we have $\bar{Q}_{t+1}^{\beta}\preceq_{\mbox{\tiny SD}}Q_{t+1}^{\beta}$ . Hence, by induction the Markov chain $Q_{t}^{\beta}$ is stochastically ordered.

Now to prove monotonicity, for $\beta<\beta^{\prime}$ let us consider a coupling similar to the one used before of two stochastically ordered Markov chains $Q_{t}^{\beta}$ and $Q_{t}^{\beta^{\prime}}$ such that $Q_{0}^{\beta}\preceq_{\mbox{\tiny SD}}Q_{0}^{\beta^{\prime}}$ . Then, as we know for all $u\in V\setminus\{u_{\mbox{\scriptsize s}}\}$ , $\beta\bm{J}(u)<\beta^{\prime}\bm{J}(u)$ , so by using induction and evolving queues using one-step queue evolution equation (Eq. (7)), we can show that $Q_{t}^{\beta}\preceq_{\mbox{\tiny SD}}Q_{t}^{\beta^{\prime}}$ for all $t$ . Hence, by induction we have $P\left[Q^{\beta}_{t}(u)>0\right]$ is an increasing function of $\beta$ for all $u\in V\setminus\{u_{\mbox{\scriptsize s}}\}$ .
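The monotonicity argument can be illustrated by simulating two chains with common random numbers. The sketch below is an assumption-laden illustration, not the paper's construction: it uses an unweighted complete graph $K_{4}$, a hypothetical function `coupled_step`, and a specific coupling in which both chains share, at every node and slot, the same forwarding destination and the same uniform random number for the arrival decision. Under this coupling the chain with the larger rate should dominate the other at every node and every step.

```python
import random

def coupled_step(q_lo, q_hi, nbrs, J, beta_lo, beta_hi, sink, rng):
    """Advance two chains one slot using shared transmission and arrival randomness."""
    nodes = list(q_lo)
    dest = {u: rng.choice(nbrs[u]) for u in nodes}   # shared forwarding decision
    coin = {u: rng.random() for u in nodes}          # shared arrival uniform

    def advance(q, beta):
        new = dict(q)
        for u in nodes:
            if q[u] > 0:                  # an empty queue ignores the decision
                new[u] -= 1
                if dest[u] != sink:
                    new[dest[u]] += 1
            if J[u] > 0 and coin[u] < beta * J[u]:
                new[u] += 1               # an arrival at rate beta_lo implies one at beta_hi
        return new

    return advance(q_lo, beta_lo), advance(q_hi, beta_hi)

# Hypothetical instance: K4 with unit weights, source 0, sink 3.
sink = 3
nbrs = {u: [v for v in range(4) if v != u] for u in range(3)}
J = {0: 1.0, 1: 0.0, 2: 0.0}
rng = random.Random(1)

q_lo = {u: 0 for u in range(3)}
q_hi = {u: 0 for u in range(3)}
dominated = True
for _ in range(5000):
    q_lo, q_hi = coupled_step(q_lo, q_hi, nbrs, J, 0.3, 0.5, sink, rng)
    dominated = dominated and all(q_lo[u] <= q_hi[u] for u in q_lo)
print(dominated)
```

Pathwise dominance under a coupling implies stochastic dominance of the marginals, which is the form in which the ordering is used in the proof.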

(2). To prove the continuity of the given function for $\beta<\beta^{*}$ , we again consider a similar coupling, this time between two stochastically ordered Markov chains $Q_{t}^{\beta}$ and $Q_{t}^{\beta-d\beta}$ with infinitesimal $d\beta$ . The data generation rule is coupled so that whenever a new data packet is generated at a node in the $Q^{\beta-d\beta}_{t}$ chain, a packet is also generated at the corresponding node in the $Q^{\beta}_{t}$ chain, but not vice versa. To understand the difference between the two chains, let $N^{\beta-d\beta}_{t}$ and $N^{\beta}_{t}$ denote the total number of packets generated in the respective chains up to time $t$ , and let $\Lambda_{t}^{\beta}=N^{\beta}_{t}-N^{\beta-d\beta}_{t}$ . Now, consider the function $g:[0,1]\rightarrow\mathbb{R}$ given by $g(\beta)={\mbox{E}\left[Q_{t+1}^{\beta}(u)-Q_{t}^{\beta}(u)\right]}$ , which is bounded by definition. In the derivative of this function, the term with $\Lambda_{t}^{\beta}=0$ is zero by the definition of the coupling, as the two chains behave differently only when there is an extra generated packet. Similarly, terms with $\Lambda_{t}^{\beta}\geq 2$ carry higher powers of $d\beta$ and vanish as $d\beta\rightarrow 0$ . Hence, the derivative $g^{\prime}(\beta)$ depends only on the $\Lambda_{t}^{\beta}=1$ term, i.e.,

[TABLE]

where $V_{s}\subset V$ is the set of data sources. So, the total number of data packets generated in the two Markov chains up to time $t$ differs by one and hence the queues at nodes in the two chains differ by at most one data packet at any time step. Now, for the given coupled chains let $t^{\prime}$ be the time by which an extra packet is generated in chain $Q_{t}^{\beta}$ . So, we have,

[TABLE]

where $P_{t^{\prime}}$ is the probability that the extra packet generated in chain $Q_{t}^{\beta}$ is present at node $u\in V\setminus\{u_{\mbox{\scriptsize s}}\}$ . This means

[TABLE]

So, if $P\left[Q_{t}^{\beta}(u)>0|Q^{\beta-d\beta}_{t}(u)=0\right]$ is defined, then as $d\beta\rightarrow 0$ the above equation gives $P\left[Q_{t}^{\beta}(u)>0\right]-P\left[Q_{t}^{\beta-d\beta}(u)>0\right]\rightarrow 0$ . Similarly, for the other side, if $P\left[Q^{\beta+d\beta}_{t}(u)>0|Q_{t}^{\beta}(u)=0\right]$ is defined, then as $d\beta\rightarrow 0$ , analogously to Eq. (9), we have $P\left[Q_{t}^{\beta+d\beta}(u)>0\right]-P\left[Q_{t}^{{\beta}}(u)>0\right]\rightarrow 0$ . If both conditions hold, the function is both left and right continuous, and hence continuous.

Now, consider all data rates $\beta<\beta^{*}$ where $\beta^{*}$ is the critical rate below which all rates are stable and above which all are unstable. For such rates both the probabilities $P\left[Q_{t}^{\beta}(u)>0|Q^{\beta-d\beta}_{t}(u)=0\right]$ and $P\left[Q^{\beta+d\beta}_{t}(u)>0|Q_{t}^{\beta}(u)=0\right]$ are defined, so as discussed above the function is continuous on both sides for all $\beta<\beta^{*}$ . Now consider the case of data rates $\beta\geq\beta^{*}$ . At $\beta^{*}$ , we know $P\left[Q_{t}^{\beta^{*}}(u)>0\right]-P\left[Q_{t}^{\beta^{*}-d\beta}(u)>0\right]$ is defined (see Eq. (8)), as the rate $\beta^{*}-d\beta$ is stable by definition; hence, the function is left continuous at this rate. On the other side, since $\beta^{*}$ is not stable, i.e., $\lim_{t\rightarrow\infty}P\left[Q^{\beta^{*}}_{t}(u)=0\right]=0$ , the probability $P\left[Q^{\beta^{*}+d\beta}_{t}(u)>0|Q_{t}^{\beta^{*}}(u)=0\right]$ is not defined and the function is not right continuous. So, for $\beta\geq\beta^{*}$ the function is left continuous but not right continuous. However, for all $u\in V\setminus\{u_{\mbox{\scriptsize s}}\}$ , $P\left[Q^{\beta}_{t}(u)>0\right]$ is a continuous function (both limits exist) for all $\beta<\beta^{*}$ .

∎

Having satisfied Szpankowski’s first condition of monotonicity, we shall use two other general results to characterize the stability region of the multi-dimensional Markov chain associated with the Data Collection process. In particular, we will use Szpankowski’s “isolation lemma” (Lemma 2) and Loynes’ scheme [18] as adapted to our situation (Lemma 3).

Lemma 2 (Szpankowski [27]).

Given $N_{t}=(N_{t}^{1},N_{t}^{2},\cdots,N_{t}^{M})$ , an $M$ -dimensional Markov chain.

1. If it is defined on a countable state space, then the stability of $N_{t}^{j}$ for all $j\in\{1,\ldots,M\}$ implies the stability of the multi-dimensional Markov chain $N_{t}$ .

2. If for some $j$ , say $j^{*}$ , $N_{t}^{j^{*}}$ is unstable, then $N_{t}$ is also unstable.

Lemma 3 (Loynes [18]).

Given a strictly stationary and ergodic pair of processes $(X_{t}^{j},Y_{t}^{j})$ , let $U_{t}^{j}=X_{t}^{j}-Y_{t}^{j}$ . Then, the following hold:

1. If ${\mbox{E}\left[U_{t}^{j}\right]}<0$ , then $N_{t}^{j}$ is stable.

2. If ${\mbox{E}\left[U_{t}^{j}\right]}>0$ , then $N_{t}^{j}$ is unstable and $\lim_{t\rightarrow\infty}N_{t}^{j}=\infty$ (a.s.).
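The dichotomy in Lemma 3 can be seen in a one-dimensional toy simulation. The sketch below makes two assumptions that are not spelled out in the lemma as stated here: the queue is taken to evolve as $N_{t+1}=\max(N_{t}+X_{t}-Y_{t},0)$ , and $X_{t}$ , $Y_{t}$ are independent Bernoulli sequences with parameters `p_in` and `p_out` (both names are hypothetical). With these choices ${\mbox{E}\left[U_{t}\right]}$ equals `p_in - p_out`.

```python
import random

def simulate_queue(p_in, p_out, steps, seed=0):
    """Return the queue length after `steps` steps of a reflected Bernoulli walk.

    Assumed recursion: N <- max(N + X - Y, 0), with X ~ Bernoulli(p_in)
    (arrivals) and Y ~ Bernoulli(p_out) (service opportunities).
    """
    rng = random.Random(seed)
    n = 0
    for _ in range(steps):
        x = 1 if rng.random() < p_in else 0
        y = 1 if rng.random() < p_out else 0
        n = max(n + x - y, 0)
    return n

if __name__ == "__main__":
    # Negative drift: the queue keeps returning to small values.
    print("drift -0.4:", simulate_queue(0.3, 0.7, 20000))
    # Positive drift: the queue grows roughly linearly in the number of steps.
    print("drift +0.4:", simulate_queue(0.7, 0.3, 20000))
```

A single sample path is only suggestive, of course; the lemma itself is a statement about stationary and ergodic input sequences in general.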

Using these tools and Szpankowski’s general method we will now prove the existence of a non-trivial stability region for the multi-dimensional Markov chain $\left\{Q_{t}^{\bm{J},\beta}\right\}_{t\geq 0}$ corresponding to a Data Collection process defined on an undirected graph $G=(V,E,w)$ .

Proof of Theorem 1.

We first proceed by proving the sufficient part i.e., existence of a non-trivial $\beta^{*}>0$ such that the multi-dimensional Markov chain is ergodic for all $\beta<\beta^{*}$ and then the necessary part of the argument i.e., for all $\beta\geq\beta^{*}$ the chain is non-ergodic.

Sufficiency.

Given a partition $(P,U)$ of $V\setminus\{u_{\mbox{\scriptsize s}}\}$ queues we define a modification of $|V|-1$ -dimensional chain $Q_{t}^{\beta}$ represented as $\bar{Q}_{t}^{\beta,U}$ where all nodes in $U$ have the same behavior as in $Q_{t}^{\beta}$ but the nodes in $V\setminus\{u_{\mbox{\scriptsize s}}\}\setminus U$ are not allowed to have empty queues. Let us now first set $U=\emptyset$ (non-persistent users) and $P=V\setminus\{u_{\mbox{\scriptsize s}}\}$ (persistent users). For any $\beta\in(0,1)$ , we know the one step basic queue evolution equation under the Data Collection process for any $u$ is as follows.

[TABLE]

So, at each node $u$ we have an arrival from $v$ with probability $P_{w}[v,u]$ in $\bar{Q}_{t}^{\beta,\emptyset}$ since the queue of $v$ is always non-empty and the departure is the usual $\sum_{v:v\sim u}P_{w}[u,v]$ .

Now, since we know $P_{w}[u_{\mbox{\scriptsize s}},v]=0$ for all $v\in V\setminus\{u_{\mbox{\scriptsize s}}\}$ , so the sum of the outgoing probabilities from $V\setminus\{u_{\mbox{\scriptsize s}}\}$ is greater than the sum of the incoming probabilities, i.e., $\sum_{u\in V\setminus\{u_{\mbox{\scriptsize s}}\}}\sum_{v:v\sim u}P_{w}[u,v]>\sum_{u\in V\setminus\{u_{\mbox{\scriptsize s}}\}}\sum_{v:v\sim u,v\in V\setminus\{u_{\mbox{\scriptsize s}}\}}P_{w}[v,u].$ Therefore, there must be a vertex $u^{*}\in V\setminus\{u_{\mbox{\scriptsize s}}\}$ for which $\sum_{v:u^{*}\sim v}P_{w}[u^{*},v]>\sum_{v:u^{*}\sim v,v\in V\setminus\{u_{\mbox{\scriptsize s}}\}}P_{w}[v,u^{*}]$ . So, from Eq. (7) for this $u^{*}$ we note that the expected drift is

[TABLE]

which is negative for an appropriately small but non-zero value of $\beta$ , let’s call it $\beta_{u^{*}}$ .
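As a concrete illustration of this step, the sketch below computes the expected one-step drift at each node when every non-sink queue is persistent. It assumes, following the description above, that the drift at $u$ is the incoming probability from non-sink neighbors minus the outgoing probability plus $\beta\bm{J}(u)$ ; the function name `persistent_drift` and the dictionary encoding of $P_{w}$ are hypothetical choices made for the example.

```python
from fractions import Fraction

def persistent_drift(P, sink, J, beta):
    """Expected drift at each non-sink node when all non-sink queues are non-empty.

    P[u][v] is the transition probability from u to v (the sink's row is empty,
    since the sink never transmits); J[u] is the relative rate of source u.
    """
    drift = {}
    for u in P:
        if u == sink:
            continue
        outgoing = sum(P[u].values())
        incoming = sum(P[v].get(u, 0) for v in P if v != sink and v != u)
        drift[u] = incoming - outgoing + beta * J.get(u, 0)
    return drift

if __name__ == "__main__":
    # Path 0 - 1 - 2 with the sink at node 2 and a single source at node 0.
    P = {0: {1: Fraction(1)},
         1: {0: Fraction(1, 2), 2: Fraction(1, 2)},
         2: {}}
    J = {0: Fraction(1)}
    print(persistent_drift(P, 2, J, Fraction(1, 4)))
```

On this path, node 0 plays the role of $u^{*}$ : its drift is $\beta-1/2$ , which is negative for every $\beta<1/2$ , whereas node 1 has zero drift in the fully persistent model and is handled only at a later stage of the induction.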

Now, to apply Loynes’ scheme to vertex $u^{*}$ we need to ensure that the sequence $(I_{t}(u^{*}),O_{t}(u^{*}))$ is strictly stationary where $I_{t}(u^{*})$ is the number of incoming packets to $u^{*}$ at time $t$ and $O_{t}(u^{*})$ is the number of outgoing packets from $u^{*}$ . Since all nodes $u\in P$ , so $u^{*}$ as well as its neighbors always have a packet in the queue, so, both $O_{t}(u^{*})$ and $I_{t}(u^{*})$ are sequences of independent Bernoulli random variables and hence are stationary and ergodic. So, we can apply Loynes’ scheme (Lemma 3) to claim that the one-dimensional process $\bar{Q}^{\beta_{u^{*}},\emptyset}_{t}(u^{*})$ is stable, and, hence, $Q^{\beta_{u^{*}}}_{t}(u^{*})$ is stable.

Now, we assume there is a non-empty set $U$ of non-persistent users and a $\beta_{U}>0$ such that $\bar{Q}_{t}^{\beta_{U},U}(U)$ is stable and has a stationary distribution. To apply Loynes’ scheme to a vertex $u\in P=V\setminus\{u_{\mbox{\scriptsize s}}\}\setminus U$ we need to ensure that the sequence $(I_{t}(u),O_{t}(u))$ is strictly stationary. Since $u\in P$ there is always a packet in the queue at $u$ , and so $O_{t}(u)$ is a sequence of independent Bernoulli random variables which take value 1 with probability $\sum_{v:v\sim u}P_{w}[u,v]$ and 0 otherwise. We decompose $I_{t}(u)$ as the sum of 0-1 random variables $A^{uv}_{t}$ , where $A^{uv}_{t}=1$ if $u$ receives a packet from $v$ at time $t$ . Then

[TABLE]

Since all $v\in P$ have a packet in their queue at all $t\geq 0$ , each $\sum_{v\in P}A^{uv}_{t}$ is a sum of Bernoulli random variables and is hence taken from a strongly stationary sequence. We start $\bar{Q}_{t}^{\beta_{U},U}$ from an initial state picked according to the stationary distribution assumed above, which ensures that the process stays in the stationary state for all $t\geq 0$ . In particular, this implies that for any $v\in U$ , the number of incoming packets from $v$ at time $t\geq 0$ is a strongly stationary sequence of random variables. Therefore $(I_{t}(u),O_{t}(u))$ is a strongly stationary sequence and we can apply Loynes’ scheme. The expected drift at time $t\geq 0$ at any $u\in P$ for any $\beta\leq\beta_{U}$ is given by

[TABLE]

Since the graph is connected, there is at least one pair $(w_{1},w_{2})$ such that $w_{1}\in U,w_{2}\in P$ and $P_{w}[w_{1},w_{2}]>0$ . Therefore we know that $\sum_{u\in P}\sum_{v\sim u}P_{w}[u,v]>\sum_{u\in P}\sum_{v\sim u,v\in P}P_{w}[v,u].$ This means that there is a $u^{*}\in P$ such that $\sum_{u^{*}\sim v}P_{w}[u^{*},v]>\sum_{u^{*}\sim v,v\in P}P_{w}[v,u^{*}].$ For this $u^{*}$ the first two terms in Eq. (10) add up to a negative value. Further, from Lemma 1 we note that the third term is continuous and increasing in $\beta$ and tends to 0 as $\beta\downarrow 0$ . Hence, it is possible to find a value $\beta_{U\cup\{u^{*}\}}$ in $(0,\beta_{U})$ such that the expected drift is negative. By Loynes’ scheme (Lemma 3), this implies that $\bar{Q}^{\beta,U}_{t}(U\cup\{u^{*}\})$ is stable for $\beta<\beta_{U\cup\{u^{*}\}}$ . Moreover, by Lemma 2, the stability of all the one-dimensional Markov chains associated with the vertices in $U\cup\{u^{*}\}$ implies the stability of the overall multi-dimensional chain. Consequently, the same holds for $Q_{t}^{\beta}(U\cup\{u^{*}\})$ . Therefore, by induction, there is a $\beta^{*}$ such that for $\beta<\beta^{*}$ , $Q_{t}^{\beta}$ is stable.

Necessity.

Corresponding to the sequence by which the stability region is expanded to include all the vertices of $V\setminus\{u_{\mbox{\scriptsize s}}\}$ there is a sequence $\beta_{u_{1}},\beta_{u_{2}},\ldots,\beta_{u_{|V\setminus\{u_{\mbox{\scriptsize s}}\}|}}$ such that $\beta^{*}=\min\{\beta_{u_{1}},\beta_{u_{2}},\ldots,\beta_{u_{|V\setminus\{u_{\mbox{\scriptsize s}}\}|}}\}.$ Let $w$ be the vertex for which $\beta_{w}=\beta^{*}$ . Assume for the sake of simplicity of presentation that $\beta_{w}<\min\{\beta_{u}:u\in V\setminus\{u_{\mbox{\scriptsize s}}\}\setminus\{w\}\}.$ Hence we can choose any $\beta$ such that $\beta_{w}<\beta<\min\{\beta_{u}:u\in V\setminus\{u_{\mbox{\scriptsize s}}\}\setminus\{w\}\}.$ For this $\beta$ we know that $Q_{t}^{\beta,V\setminus\{u_{\mbox{\scriptsize s}}\}\setminus\{w\}}(V\setminus\{u_{\mbox{\scriptsize s}}\}\setminus\{w\})$ is stable. If we start this chain from its stationary distribution then the number of packets that are transmitted from $V\setminus\{u_{\mbox{\scriptsize s}}\}\setminus\{w\}$ to $w$ form a strongly stationary sequence. Since $w$ is persistent in this setting the packets leaving it are also strongly stationary. Hence Loynes’ scheme (Lemma 3) can be applied. By the choice of $\beta$ we know that the expected drift at $w$ is strictly positive and so $\bar{Q}_{t}^{\beta,V\setminus\{u_{\mbox{\scriptsize s}}\}\setminus\{w\}}(w)$ is unstable and hence by Lemma 2, $\bar{Q}_{t}^{\beta,V\setminus\{u_{\mbox{\scriptsize s}}\}\setminus\{w\}}$ is unstable.

In order to show that $Q_{t}^{\beta}$ is also unstable for this choice of $\beta$ we will show there is a coupling of $Q_{t}^{\beta}$ and $\bar{Q}_{t}^{\beta,V\setminus\{u_{\mbox{\scriptsize s}}\}\setminus\{w\}}$ with an appropriately chosen initial condition such that the two models behave identically. On the set of sample paths (of positive probability) on which the queue at $w$ remains strictly positive, the two coupled models behave identically, because a difference arises only if the queue at $w$ becomes 0 at time $t$ , in which case $\bar{Q}_{t+1}^{\beta,V\setminus\{u_{\mbox{\scriptsize s}}\}\setminus\{w\}}(w)$ is automatically set to 1 since $w$ is persistent, while $Q^{\beta}_{t+1}(w)$ remains 0. Now, we know that $\bar{Q}_{t}^{\beta,V\setminus\{u_{\mbox{\scriptsize s}}\}\setminus\{w\}}(w)$ is unstable, so when we start $\bar{Q}_{t}^{\beta,V\setminus\{u_{\mbox{\scriptsize s}}\}\setminus\{w\}}(V\setminus\{u_{\mbox{\scriptsize s}}\}\setminus\{w\})$ according to its stationary distribution and set the queue at $w$ to 1, there is positive probability that this queue never reaches 0. On those sample paths $Q_{t}^{\beta}(w)$ behaves like $\bar{Q}_{t}^{\beta,V\setminus\{u_{\mbox{\scriptsize s}}\}\setminus\{w\}}(w)$ , i.e., it is unstable. Therefore with these initial conditions $Q^{\beta}_{t}(w)$ is not substable since with positive probability $\lim_{t\rightarrow\infty}P\left[Q^{\beta}_{t}(w)>m\right]$ , for all finite $m$ . Hence, $Q^{\beta}_{t}(w)$ is unstable and, by Lemma 2, $Q^{\beta}_{t}$ is unstable for our choice of $\beta$ and, by the monotonicity of the process (see Lemma 1), it is unstable for all choices of $\beta\geq\beta^{*}$ . ∎

Having established the existence of a non-trivial critical data rate for the Markov chain $\left\{Q_{t}^{\bm{J},\beta}\right\}_{t\geq 0}$ of Data Collection process below which the chain is ergodic and above which it is non-ergodic, we will now characterize this critical rate.

4 Characterizing the critical rate

In Section 3, we proved that the Markov chain associated with the Data Collection process is ergodic below the critical rate, so that its stationary distribution exists. In this section we show that, at steady state, the Data Collection process is described by a special class of linear equations which we call the “one-sink” Laplacian system. Using this equivalence we derive a lower bound on the critical rate. We also discuss some common topologies in the context of this result and show some tight examples. Lastly, we present an upper bound on the critical rate.

4.1 Equivalence to one-sink Laplacian systems

The basic one step queue evolution equation under the Data Collection process for any node $u\in V$ is as follows.

[TABLE]

where the second and third terms on the right-hand side of the above equation represent the transmissions sent to and received from the neighbors, respectively, and $A_{t}(u)$ is the number of packets generated at $u$ , which is 1 with probability $\beta\bm{J}(u)$ if $u\in V_{s}$ ; for the sink $\beta\bm{J}(u_{\mbox{\scriptsize s}})=-\beta\sum_{v\in V_{s}}\mathbf{J}(v)$ , and for all other nodes $u\notin V_{s}\cup\{u_{\mbox{\scriptsize s}}\}$ we have $\bm{J}(u)=0$ . Now, taking expectations on both sides of Eq. (11), letting $\bm{\eta}^{\beta}_{t}(u)=P\left[Q_{t}^{\beta}(u)>0\right]$ be the queue occupancy probability of node $u$ at time $t$ , and observing that ${\mbox{E}\left[A_{t}(u)\right]}=\beta\mathbf{J}(u)$ , where $\bm{J}$ is the relative rate vector, we have

[TABLE]

From Theorem 1, we know that for an appropriately chosen value of $\beta$ the Data Collection process has a steady state. Moreover, at steady state ${\mbox{E}\left[Q_{t}^{\beta}(u)\right]}$ is a constant, so if we let $\bm{\eta}^{\beta}(u)=\lim_{t\rightarrow\infty}P\left[Q_{t}^{\beta}(u)>0\right]$ be the queue occupancy probability of node $u$ at the stationarity, then we have the steady-state equation for the given node as

[TABLE]

We can also represent the steady-state equations of all $|V|=n$ nodes in matrix form as follows. For this, let us first order the nodes such that the $n$ th node represents the sink. Let $\bm{\eta}$ be an $n$ -element column vector representing the steady-state queue occupancy probabilities $\bm{\eta}^{\beta}(u)$ of the nodes $u\in V$ . We drop the superscript $\beta$ where the rate is understood from the context. So, we have $\bm{\eta}^{T}=[\bm{\eta}(1)\ \bm{\eta}(2)\ \cdots\ \bm{\eta}(n-1)\ 0]$ . The last entry is 0 because the sink collects all data it receives and maintains no queue. Let $\bm{J}$ be another $n$ -element column vector such that $\bm{J}(i)>0$ if $i\in V_{s}$ , $\bm{J}(u_{\mbox{\scriptsize s}})=-\sum_{i\in V_{s}}\bm{J}(i)$ and 0 elsewhere, and let $I$ be the usual $n\times n$ identity matrix. So, given the transition matrix $P_{w}$ for the random walk defined by $w$ on graph $G$ , the steady-state queue equations at the nodes can be written in matrix form as

$$\bm{\eta}^{T}(I-P_{w})=\beta\bm{J}^{T}.\tag{14}$$

As we know transition matrix $P_{w}=D^{-1}A$ where $D$ is the diagonal matrix of generalized degrees and $A$ is the adjacency matrix, so matrix $(I-P_{w})$ is also a Laplacian as we can rewrite it as $(I-P_{w})=D^{-1}(D-A)=D^{-1}L$ . So, the above equation (Eq. (14)) can be rewritten as

$$\bm{x}^{T}L=\beta\bm{J}^{T},\tag{15}$$

where $\bm{x}^{T}=\bm{\eta}^{T}D^{-1}$ is a row vector such that $\bm{x}(u)=\bm{\eta}(u)/\mbox{deg}(u)$ for all $u$ where $\bm{\eta}(u)$ is the steady-state queue occupancy probability and $\mbox{deg}(u)=\sum_{v:(u,v)\in E}w_{uv}$ is the generalized degree of node $u$ . Eq. (15) is similar to Laplacian systems of the form $L\mathbf{x}=\mathbf{b}$ with a constraint that only one element in $\mathbf{b}$ is negative. We call such systems “one-sink” Laplacian systems. In our subsequent work [8] we discuss this connection in detail.
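To make the one-sink system concrete, the following sketch solves it exactly on a small graph. It assumes the system takes the form $L\bm{x}=\beta\bm{J}$ with $\bm{x}(u_{\mbox{\scriptsize s}})=0$ (using the symmetry of $L$ ), which is handled by deleting the sink's row and column before solving; the helper names `solve` and `occupancy` are hypothetical, and exact rational arithmetic is used only to keep the example free of rounding.

```python
from fractions import Fraction

def solve(M, b):
    """Gauss-Jordan elimination over exact fractions (M square, non-singular)."""
    n = len(M)
    aug = [row[:] + [b[i]] for i, row in enumerate(M)]
    for c in range(n):
        pivot = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[pivot] = aug[pivot], aug[c]
        aug[c] = [a / aug[c][c] for a in aug[c]]
        for r in range(n):
            if r != c and aug[r][c] != 0:
                f = aug[r][c]
                aug[r] = [a - f * p for a, p in zip(aug[r], aug[c])]
    return [aug[i][n] for i in range(n)]

def occupancy(A, J, beta):
    """Return eta = D x, where L x = beta * J and x(sink) = 0.

    A is a symmetric weight matrix with the sink as the last node;
    J is the relative rate vector (its sink entry is not needed here).
    """
    n = len(A)
    deg = [sum(row) for row in A]
    L = [[Fraction((deg[u] if u == v else 0) - A[u][v]) for v in range(n)]
         for u in range(n)]
    # Ground the sink: drop its row and column so the system is non-singular.
    grounded = [row[:n - 1] for row in L[:n - 1]]
    x = solve(grounded, [beta * J[u] for u in range(n - 1)]) + [Fraction(0)]
    return [deg[u] * x[u] for u in range(n)]

if __name__ == "__main__":
    # Path 0 - 1 - 2, unit weights, source at node 0, sink at node 2.
    A = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
    J = [Fraction(1), Fraction(0), Fraction(-1)]
    print(occupancy(A, J, Fraction(1, 4)))
```

For this path with $\beta=1/4$ the solver returns occupancy probabilities $1/2$ , $1/2$ and $0$ , which can be checked by hand against the flow balance at each node.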

4.2 A lower bound

Now having established the steady-state equation for the Data Collection process, we will use it for characterising the critical data rate. In particular, we will prove a lower bound on such rate.

Proof of Theorem 2.

For a given graph $G=(V,E,w)$ , with source set $V_{s}\subseteq V\setminus\{u_{\mbox{\scriptsize s}}\}$ and transition matrix $P_{w}$ for random walk defined by $w$ on graph $G$ , recall that the steady-state queue equations at nodes can be written in vector form as

$$\bm{\eta}^{T}(I-P_{w})=\beta\bm{J}^{T}.\tag{16}$$

Now, in order to bound the maximum stable data rate $\beta$ at which the source nodes generate data in terms of the underlying graph parameters, we will consider the eigendecomposition of the left-hand side of Eq. (16). For this, we deviate from the usual inner product on the vector space $\mathbb{R}^{V}$ , i.e., $\langle f,g\rangle=\sum_{x\in V}f(x)g(x)$ , and define another inner product on $\mathbb{R}^{V}$ given by $\langle f,g\rangle_{\mu}:=\sum_{x\in V}f(x)g(x)\mu(x)$ , where $\mu$ is the stationary distribution of the random walk defined by $w$ on the graph, satisfying $\mu=\mu P_{w}$ . From Lemma 12.2 of [16], it is known that the inner product space $(\mathbb{R}^{V},\langle\cdot,\cdot\rangle_{\mu})$ has an orthonormal basis of real-valued eigenfunctions $\{f_{j}\}_{j=1}^{|V|}$ corresponding to real eigenvalues $\{\lambda_{j}\}$ . Using this lemma and writing the vector $\bm{\eta}^{T}$ in terms of the eigenvectors, we have $\bm{\eta}^{T}=\sum_{i=1}^{|V|}\langle\bm{\eta}^{T},f_{i}\rangle_{\mu}f_{i}.$ This gives us that $\bm{\eta}^{T}(I-P_{w})=\sum_{i=1}^{|V|}(1-\lambda_{i}^{w})\langle\bm{\eta}^{T},f_{i}\rangle_{\mu}f_{i}$ , where $\lambda_{i}^{w}$ is the $i^{th}$ eigenvalue of the transition matrix $P_{w}$ . Moreover, from Lemma 12.1 of [16], we also know that the absolute value of any eigenvalue of a transition matrix is at most $1$ , so $\lambda_{1}^{w}=1>\lambda_{2}^{w}\geq\cdots\geq\lambda_{n}^{w}$ . So, we have

[TABLE]

Note, that $f_{1},\ldots,f_{|V|}$ form an orthonormal basis so, $\sum_{i=1}^{|V|}\langle\bm{\eta}^{T},f_{i}\rangle_{\mu}^{2}=\lVert\bm{\eta}^{T}\rVert_{\mu}^{2}$ . Hence we have

[TABLE]

The eigenfunction $f_{1}$ corresponding to the eigenvalue 1 can be taken to be a constant vector 1, so $\langle\bm{\eta}^{T},f_{1}\rangle_{\mu}=\sum\limits_{i=1}^{n}{\bm{\eta}(i)\mu(i)}$ , where $\mu(i)=\sum_{v\in V}\mu(v)P_{w}[v,i]$ . Also, $\lVert\bm{\eta}^{T}\rVert_{\mu}^{2}=\sum\limits_{i=1}^{n}{\bm{\eta}^{2}(i)\mu(i)}$ . So, using these results in Eq. (19) we have

[TABLE]

where, $\bar{\bm{\eta}}_{\mu}=\sum\limits_{i=1}^{n}{\bm{\eta}(i)\mu(i)}$ is the expected queue occupancy probability of nodes under stationary distribution $\mu$ . Now, taking square of norm of Eq. (18) and using Eq. (20), we have

[TABLE]

Using Eq. (21) in the square of norm of Eq. (16), we have

[TABLE]

Moreover, as $\sum_{i\in V_{s}}\bm{J}^{2}(i)\leq(\sum_{i\in V_{s}}\bm{J}(i))^{2}$ , so we have

[TABLE]

where $\mu_{m}=\max_{i\in V_{s}}\mu(i)$ .

Now, to get a bound on $\mbox{Var}_{\mu}(\bm{\eta}(i))=\sum\limits_{i=1}^{n}(\bm{\eta}(i)-\bar{\bm{\eta}}_{\mu})^{2}\mu(i)$ , we consider two nodes whose queue occupancy probability we know precisely: (1) the sink, $u_{\mbox{\scriptsize s}}$ , which has $\bm{\eta}(u_{\mbox{\scriptsize s}})=0$ (as it maintains no queue and sinks data packets as soon as it receives them), and (2) a node $u_{\max}$ with maximum queue occupancy probability for a given $\beta$ , which we denote by $\bm{\eta}_{\max}^{\beta}=\max_{u\in V\setminus\{u_{\mbox{\scriptsize s}}\}}\bm{\eta}^{\beta}(u)$ . Now, let $\beta=(1-\delta)\beta^{*}$ where $\beta^{*}$ is the critical data rate and $\delta\in(0,1)$ . From Eq. (14) we know $\bm{\eta}$ is linear in $\beta$ and $\bm{\eta}_{\max}^{\beta^{*}}=1$ , so $\bm{\eta}_{\max}^{\beta}=\frac{\beta}{\beta^{*}}$ and hence, we have $\bm{\eta}_{\max}^{\beta}=1-\delta$ .

We note that the contribution of $u_{\mbox{\scriptsize s}}$ and $u_{\max}$ with $\bar{\bm{\eta}}_{\mu}=\sum_{i=1}^{n}\bm{\eta}(i)\mu(i)$ as the expected queue occupancy probability of nodes under the stationary distribution $\mu$ is as follows.

[TABLE]

where the last inequality holds because $(1-\delta-\bar{\bm{\eta}}_{\mu})^{2}\mu(u_{\max})+(\bar{\bm{\eta}}_{\mu}-0)^{2}\mu(u_{\mbox{\scriptsize s}})$ attains its minimum at $\bar{\bm{\eta}}_{\mu}=\frac{(1-\delta)\mu(u_{\max})}{\mu(u_{\max})+\mu(u_{\mbox{\scriptsize s}})}$ . Using Eq. (23) and Eq. (24) in Eq. (22), and noting that $\delta\rightarrow 0$ as $\beta\rightarrow\beta^{*}$ , we have

[TABLE]

Now, we know $\mu(i)=\frac{\mbox{deg}(i)}{\sum_{u\in V}\mbox{deg}(u)}$ , and $\frac{d_{\min}}{\sum_{u\in V}\mbox{deg}(u)}\leq\mu(i)\leq\frac{d_{\max}}{\sum_{u\in V}\mbox{deg}(u)}$ where, $d_{\min}$ and $d_{\max}$ are the generalized minimum and maximum degrees of graph respectively. So using the appropriate bounds on $\mu(i)$ in Eq. (25) we have

[TABLE]

where $\lambda_{2}^{w}$ is the second largest eigenvalue of the transition matrix of the random walk defined by the weight function $w$ and $d_{u_{\mbox{\scriptsize s}}}$ is the generalized degree of the sink node. ∎
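The proof relies on two standard facts about the weighted random walk: $\mu(i)=\mbox{deg}(i)/\sum_{u\in V}\mbox{deg}(u)$ is stationary, and the walk is reversible with respect to $\mu$ , which is what makes $P_{w}$ self-adjoint under $\langle\cdot,\cdot\rangle_{\mu}$ . The sketch below checks both facts exactly on a small weighted graph; the function name `stationary_checks` and the example weights are illustrative assumptions, not taken from the paper.

```python
from fractions import Fraction

def stationary_checks(A):
    """For a symmetric integer weight matrix A, return (mu, is_stationary, is_reversible)."""
    n = len(A)
    deg = [sum(row) for row in A]            # generalized degrees
    total = sum(deg)
    mu = [Fraction(d, total) for d in deg]   # candidate stationary distribution
    P = [[Fraction(A[u][v], deg[u]) for v in range(n)] for u in range(n)]
    # Stationarity: mu = mu P_w, checked coordinate by coordinate.
    is_stationary = all(
        sum(mu[v] * P[v][i] for v in range(n)) == mu[i] for i in range(n)
    )
    # Detailed balance: mu(u) P[u,v] = mu(v) P[v,u] for every pair.
    is_reversible = all(
        mu[u] * P[u][v] == mu[v] * P[v][u] for u in range(n) for v in range(n)
    )
    return mu, is_stationary, is_reversible

if __name__ == "__main__":
    # Weighted triangle with edge weights 2, 1 and 3.
    A = [[0, 2, 1], [2, 0, 3], [1, 3, 0]]
    print(stationary_checks(A))
```

Both checks reduce to the symmetry of the weights, $w_{uv}=w_{vu}$ , so they hold for any undirected weighted graph with positive degrees.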

In Table 1, we present the lower bound on the critical data rate for the stochastic Data Collection process. We also present the exact values of the critical data rate, which are easy to calculate using elementary algebra for these topologies. In all these cases, we assume that all edges have unit weight $w:E\rightarrow\bm{1}$ , i.e., the random walk defined by $P_{w}$ is the simple random walk, and that there is only one source node, i.e., $|V_{s}|=1$ such that $\sum_{i\in V_{s}}\bm{J}(i)=1$ .

For the complete graph topology it is easy to see that the exact rate is $n/(2(n-1))$ . As the spectral gap of the simple random walk on the complete graph of $n$ nodes is $n/(n-1)$ , our lower bound is tight for this case, i.e., both the exact value and the lower bound have order $\Theta(1)$ . Similarly, for the star graph with the sink at the outer edge, our lower bound is tight and is of order $\Theta(1/n)$ . Hence our lower bound cannot admit any asymptotic improvement in general. On the other hand, the cycle topology shows that for specific cases a better lower bound may be possible: our spectral gap-based lower bound is a factor of $\Theta(1/n)$ lower than the exact value for this case. Similarly, for other topologies like the wheel graph, the complete binary tree and the $k$ -times star of star graph ( $n^{1/k}$ -regular tree defined on $k$ levels) a better lower bound is possible.
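The exact rate quoted for the complete graph can be verified directly. The sketch below proposes a steady state by symmetry (one value at the source, a common value at the other non-sink nodes, zero at the sink) and checks that it satisfies $\bm{\eta}^{T}(I-P_{w})=\beta\bm{J}^{T}$ coordinate by coordinate. The closed-form candidate and the function names are assumptions of this example; the paper only states the resulting rate.

```python
from fractions import Fraction

def complete_graph_occupancy(n, beta):
    """Candidate steady state on K_n: node 0 is the source, node n-1 the sink."""
    a = beta * Fraction(n - 1, n)        # occupancy of a non-source, non-sink node
    return [2 * a] + [a] * (n - 2) + [Fraction(0)]

def satisfies_steady_state(n, beta):
    """Check eta(u) - sum_v eta(v) P[v,u] == beta * J(u) at every node of K_n."""
    eta = complete_graph_occupancy(n, beta)
    J = [Fraction(1)] + [Fraction(0)] * (n - 2) + [Fraction(-1)]
    p = Fraction(1, n - 1)               # simple random walk on K_n
    for u in range(n):
        inflow = p * (sum(eta) - eta[u])
        if eta[u] - inflow != beta * J[u]:
            return False
    return True

if __name__ == "__main__":
    n = 6
    beta_star = Fraction(n, 2 * (n - 1))
    print(satisfies_steady_state(n, beta_star),
          max(complete_graph_occupancy(n, beta_star)))
```

Since the largest occupancy, attained at the source, equals $2\beta(n-1)/n$ , it reaches 1 exactly at $\beta=n/(2(n-1))$ , in agreement with the value stated above.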

4.3 An upper bound

We also prove an upper bound on the critical data rate for a special case where $V_{s}=V\setminus\{u_{\mbox{\scriptsize s}}\}$ . In order to present this bound, we need to define some terms. For any vertex $u\in V$ , we define its measure as, $\rho(u):=\sum\limits_{v\in V}P_{w}[u,v]$ . Similarly, for any $U\subset V$ we define the measure $\rho(U)=\sum\limits_{u\in U}\rho(u)$ . We also define the edge boundary as $\partial U:=\{(u,v):u\in U,v\notin U\}$ , so, $\rho(\partial U)=\sum\limits_{u\in U,v\notin U}P_{w}[u,v]$ . We have the following upper bound result.

Proposition 1.

Given a graph $G=(V,E,w)$ with $|V|=n$ nodes out of which there is one sink $u_{\mbox{\scriptsize s}}$ and set $V_{s}=V\setminus\{u_{\mbox{\scriptsize s}}\}$ of source nodes running a Data Collection process having critical data rate $\beta^{*}$ as defined by Theorem 1. To achieve stable queues $\beta^{*}$ must satisfy

[TABLE]

where $P_{w}$ is the transition matrix of random walk defined by $w$ , $\hat{h}(G)=\min\limits_{U\subset V,u_{\mbox{\scriptsize s}}\notin U}\frac{\rho(\partial U)}{\rho(U)}$ is a constant and $\hat{h}(G)$ is at most $h(G)$ , the edge expansion of graph $G$ .

Proof of Proposition 1.

Given any vertex $u\in V$ , recall its measure is defined as, $\rho(u):=\sum\limits_{v\in V}P_{w}[u,v]$ , and for any $U\subset V$ we have $\rho(U)=\sum\limits_{u\in U}\rho(u)$ . Similarly, for edge boundary as $\partial U:=\{(u,v):u\in U,v\notin U\}$ , we have $\rho(\partial U)=\sum\limits_{u\in U,v\notin U}P_{w}[u,v]$ . Now, let us define constants $h(U):=\frac{\rho(\partial U)}{\rho(U)}$ and $\hat{h}(G):=\min\limits_{U\subset V,u_{\mbox{\scriptsize s}}\notin U}h(U)\leq h(G)$ where $h(G)$ is the edge expansion of graph $G$ .

We know, for any given set $U\subset V$ , where $u_{\mbox{\scriptsize s}}\notin U$ the maximum data flow that can move out of this set is the flow across the boundary $\partial U$ , so

[TABLE]

Now, for the set $U=V_{s}=V\setminus\{u_{\mbox{\scriptsize s}}\}$ , we have $\hat{h}(G)\leq\sum\limits_{u:u\sim u_{\mbox{\scriptsize s}}}\frac{P_{w}[u,u_{\mbox{\scriptsize s}}]}{n-1}$ . So, from Eq. (28), $\beta\leq\sum\limits_{u:u\sim u_{\mbox{\scriptsize s}}}\frac{P_{w}[u,u_{\mbox{\scriptsize s}}]}{n-1}$ . Hence, the upper bound on the critical data rate is given by,

[TABLE]

∎
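For small graphs the quantities in this proof can be computed by brute force. The sketch below enumerates every set $U$ that excludes the sink to obtain $\hat{h}(G)$ , and compares it with the value obtained from the particular choice $U=V\setminus\{u_{\mbox{\scriptsize s}}\}$ . The function names are hypothetical, and the enumeration is exponential in $n$ , so this is meant only as an illustration on toy instances.

```python
from fractions import Fraction
from itertools import combinations

def transition_matrix(A):
    """P_w = D^{-1} A for a symmetric integer weight matrix A."""
    deg = [sum(row) for row in A]
    n = len(A)
    return [[Fraction(A[u][v], deg[u]) for v in range(n)] for u in range(n)]

def h_hat(P, sink):
    """Minimum of rho(boundary of U) / rho(U) over non-empty U not containing the sink."""
    n = len(P)
    others = [u for u in range(n) if u != sink]
    best = None
    for k in range(1, len(others) + 1):
        for U in combinations(others, k):
            inside = set(U)
            boundary = sum(P[u][v] for u in U for v in range(n) if v not in inside)
            rho_U = sum(sum(P[u]) for u in U)   # each row of P_w sums to 1
            ratio = boundary / rho_U
            if best is None or ratio < best:
                best = ratio
    return best

def sink_cut_bound(P, sink):
    """Value of the ratio for U = V minus the sink: sum of P[u, sink] over (n - 1)."""
    n = len(P)
    return sum(P[u][sink] for u in range(n) if u != sink) / (n - 1)

if __name__ == "__main__":
    # Path 0 - 1 - 2 - 3 with the sink at node 3.
    A = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
    P = transition_matrix(A)
    print(h_hat(P, 3), sink_cut_bound(P, 3))
```

On the path and on the complete graph the minimum is attained by the full set of non-sink nodes, so the two quantities coincide there; on other graphs $\hat{h}(G)$ can be strictly smaller.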

Note that our derived upper and lower bounds on the critical data rate relate directly to the two sides of Cheeger’s inequality [4].

5 Geometric rate of convergence

Next, we characterize the rate of convergence of the Markov chain $\left\{Q_{t}^{\bm{J},\beta}\right\}_{t\geq 0}$ in the stable regime, i.e., $\beta<\beta^{*}$ . In particular, we first prove a general result about the total variation distance between the probability distributions of two Markov chains and their rate of convergence. Then, as a special case of this result, we show that the convergence of the Markov chain $\left\{Q_{t}^{\bm{J},\beta}\right\}_{t\geq 0}$ is geometric, i.e., starting from any initial state, the distance from stationarity decreases exponentially. Note that we drop the superscript $\bm{J},\beta$ from the Markov chain notation, as a stable data rate is assumed throughout the proof of the convergence rate.

Proof of Theorem 3.

We first note that our Markov chain $Q_{t}$ is stochastically ordered (cf. [19]). In general, given two random processes $X$ and $Y$ supported on $(\mathbb{N}\cup\{0\})^{|V\setminus\{u_{\mbox{\scriptsize s}}\}|}$ , we say $X$ is stochastically dominated by $Y$ if ${\mbox{E}\left[f(X)\right]}\leq{\mbox{E}\left[f(Y)\right]}$ for every increasing function $f$ . For our Data Collection chain we state the stochastic orderedness property as follows.

Claim 1.

Given two instances of the Data Collection process $Q_{t}$ and $Q^{\prime}_{t}$ such that $Q_{0}\preceq Q^{\prime}_{0}$ , $Q_{t}$ is stochastically dominated by $Q^{\prime}_{t},t\geq 0$ . In particular this means that $P\left[Q_{t}(v)>0\right]\leq P\left[Q^{\prime}_{t}(v)>0\right]$ for all $v\in V\setminus\{u_{\mbox{\scriptsize s}}\}$ .

The proof of this claim follows by constructing a coupling between the two chains such that each of them performs exactly the same transmission actions. In case a queue in one of the chains is empty, the transmission action is a dummy action. It is easy to see that stochastic ordering follows naturally for the Data Collection chain.

To use this claim, for our irreducible and aperiodic Markov chain $Q_{t}$ described by the Data Collection process defined on $(\mathbb{N}\cup\{0\})^{|V|-1}$ having transition matrix $\mathcal{P}$ and a stationary distribution $\pi$ , let us define two other irreducible and aperiodic Markov chains $Q^{1}_{t}$ and $Q^{2}_{t}$ , each with state space $(\mathbb{N}\cup\{0\})^{|V|-1}$ . Initially, suppose the data is generated in the two chains in a coupled way such that one of them dominates the other i.e., either $Q^{1}_{0}(v)\leq Q^{2}_{0}(v)$ for all $v\in V\setminus\{u_{\mbox{\scriptsize s}}\}$ or vice-versa.

Now, consider the coupling $(Q^{1}_{t},Q^{2}_{t})$ on $(\mathbb{N}\cup\{0\})^{|V|-1}\times(\mathbb{N}\cup\{0\})^{|V|-1}$ defined over random sequences $\{0,1\}\times\{\prod_{v\in V\setminus\{u_{\mbox{\scriptsize s}}\}}\Gamma(v)\}$ where $\Gamma(v)$ is the set of one-step destinations from node $v$ , such that both the chains $Q^{1}_{t}$ and $Q^{2}_{t}$ are populated in a coupled way. Such Markov chains are said to be stochastically ordered chains in the queueing theory and have a property that the Markov chain which dominates the other chain will always maintain dominance over it.

Now, under this coupling we allow the two chains to run in a way that any data generation or data transmission decision made by any queue in one chain is followed by the corresponding queue in the other chain as well. However, to distinguish the newly generated packets in two chains from the existing ones, we assign colors to the data packets: the existing packets in $Q^{1}_{t}$ chain are colored red and in $Q^{2}_{t}$ chain are colored blue, and the newly generated packets in both the chains are colored green. Moreover, in both the chains green (newly generated) packets get a preference in the transmission. Now, let $Q_{t}^{1,green}(u)$ represent the number of green packets in the queue of a given node $u$ in $Q^{1}_{t}$ and $\bm{\eta}(u)$ be the steady-state queue occupancy probability of Markov chain $Q_{t}$ . Since, the number of green packets in both the chains starts from zero and the chains are stochastically ordered, green packet queue occupancy is always bounded by that of the chain with stationary distribution i.e., $P\left[Q_{t}^{1,green}(u)\geq 1\right]\leq\bm{\eta}(u)$ . Same holds true for the other chain $Q^{2}_{t}$ as well.

Now, to ensure both chains get coupled, all the red and blue (old) packets in $Q^{1}_{t}$ and $Q^{2}_{t}$ respectively need to be sunk. We consider the $Q^{1}_{t}$ chain; the same will hold for $Q^{2}_{t}$ as well. By our preference in transmission, the probability that red packets move out of a queue in one time step in $Q^{1}_{t}$ is equal to the probability that there are no green packets in the given queue, i.e., $1-P\left[Q_{t}^{1,green}(u)\geq 1\right]$ . Also, we have $1-P\left[Q_{t}^{1,green}(u)\geq 1\right]\geq 1-\bm{\eta}(u)\geq\min_{u}(1-\bm{\eta}(u))=1-\bm{\eta}_{max}$ , where $\bm{\eta}_{max}=\max_{u\in V\setminus\{u_{\mbox{\scriptsize s}}\}}\bm{\eta}(u)$ . Now, let $N^{(red)}$ and $N^{(blue)}$ be the total number of red and blue data packets in chains $Q^{1}_{t}$ and $Q^{2}_{t}$ respectively at the beginning, which are assumed to be finite. Also, let $T_{N^{(red)}}$ and $T_{N^{(blue)}}$ be the time taken by the respective number of packets to get sunk. We have the following lemma that bounds this time.

Lemma 4.

Given a Data Collection process on graph $G$ with $N^{(*)}<\infty$ data packets initially present in the queues of all nodes, the time $T_{N^{(*)}}$ taken by all of these packets to reach the sink is bounded as

[TABLE]

where $t_{\mbox{\scriptsize hit}}$ is the worst-case hitting time of random walk on $G$ and $\bm{\eta}_{\max}$ is the maximum queue occupancy probability at stationarity.

Proof of Lemma 4.

To prove this lemma we follow the delay analysis of Leighton et al. [14]. For our Data Collection process on graph $G$ with $N^{(*)}<\infty$ data packets initially present in the queues of all nodes, each data packet has its own trajectory, or trace of random walk, which describes its path to the sink. Moreover, to each of these $N^{(*)}$ packets we assign a distinct rank from a range $K$, which will be determined later, and the packet with the lowest rank always gets preference in transmission. Among all possible sets of ranks assigned to the packets we choose a particular trace of random walk and find a delay sequence for it.

A delay sequence of length $s$, as defined by Leighton et al., involves backtracking the paths of $s+1$ data packets, where $s$ is determined by the analysis. In particular, given a data packet $p_{1}$ that arrived at the sink, we follow it backwards to the edge where it was last delayed; call that edge $e_{1}$ . Let $\ell_{1}$ be the length of the path from the sink to edge $e_{1}$, and suppose $p_{1}$ was delayed by packet $p_{2}$ . Then we follow $p_{2}$ backwards to the edge where it was delayed by some packet. This is repeated until we reach packet $p_{s+1}$, which delayed packet $p_{s}$ over edge $e_{s}$ . The path from $e_{s}$ to the sink then forms a delay sequence. Moreover, the intermediate $s$ paths of lengths $\ell_{1},\ell_{2},\cdots,\ell_{s}\geq 0$ have the property that $\sum_{i=1}^{s}\ell_{i}\leq D$, where $D$ is the maximum number of edges that can be traversed by a trace of random walk.

From our earlier argument, the probability that any of the $N^{(*)}$ (old) packets moves out of its queue in one time step is at least $1-\bm{\eta}_{\max}$ , where $\bm{\eta}_{\max}$ is the maximum queue occupancy probability at stationarity. This means that any one step in this stochastic process takes on average at most $\frac{1}{1-\bm{\eta}_{\max}}$ time. So, the expected time taken by any random walk to hit the sink is at most $\frac{t_{\mbox{\scriptsize hit}}}{1-\bm{\eta}_{\max}}$, where $t_{\mbox{\scriptsize hit}}$ is the worst-case hitting time of the random walk. By Markov’s inequality, $P\left[D\geq\frac{2t_{\mbox{\scriptsize hit}}}{1-\bm{\eta}_{\max}}\right]\leq\frac{1}{2}$ . Now consider the probability that a random walk does not hit the sink $u_{\mbox{\scriptsize s}}$ within time $\frac{t_{\mbox{\scriptsize hit}}}{1-\bm{\eta}_{\max}}\cdot 2(\log 1/\epsilon+2)$ . We divide this time into $(\log 1/\epsilon+2)$ slots of length $\frac{2t_{\mbox{\scriptsize hit}}}{1-\bm{\eta}_{\max}}$ each. By the Markov property of random walks, the random walks in these slots are independent. So, we have the following result.

[TABLE]
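As a sketch of the slot argument above (an illustration under stated assumptions, not a reconstruction of the displayed equation), assume the logarithm is base 2 and that $k=\log(1/\epsilon)+2$ is an integer. Each slot of length $\frac{2t_{\mbox{\scriptsize hit}}}{1-\bm{\eta}_{\max}}$ fails to contain a hit with probability at most $\frac{1}{2}$ by Markov's inequality, whatever node the walk occupies at the start of the slot, so the failure probabilities multiply across slots:

```latex
\[
P\left[\text{no hit in } k \text{ consecutive slots}\right]
  \;\leq\; \left(\tfrac{1}{2}\right)^{k}
  \;=\; 2^{-\left(\log_{2}(1/\epsilon)+2\right)}
  \;=\; \frac{\epsilon}{4}.
\]
```

With a different logarithm base the constant changes, but the failure probability still decays geometrically in the number of slots.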

So, there are now two delays associated with any data packet: one is the self-delay of $1-\bm{\eta}_{\max}$ and the other is due to the presence of other data packets in the queue. The number of different delay sequences of length $s$ is at most $N^{(*)}\cdot(N^{(*)})^{s}\cdot\binom{D+s}{s}\cdot\binom{s+K}{s+1}$ . This is because there are at most $\binom{D+s}{s}$ ways of choosing the intermediate path lengths $\ell_{i}$ such that $\sum_{i=1}^{s}\ell_{i}\leq D$ (despite the self-delay, the number of steps is still upper bounded by $D$ ), and there are $N^{(*)}$ possibilities for choosing packet $p_{1}$ . Similarly, for each of the other $s$ delaying packets there are $N^{(*)}$ possibilities. The last factor comes from choosing a set of ranks from range $K$ . Moreover, the probability of choosing a delay sequence such that the ranks are distinct is $1/K^{s+1}$ . So,

[TABLE]

If we set $K\geq 8N^{(*)}$ and $s=D+8N^{(*)}+\log 1/\epsilon-1$ , then

[TABLE]

Finally, combining Eq. (32) and Eq. (34) we get the desired result. ∎
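The counting step in the proof of Lemma 4 can be checked numerically for small parameters. The sketch below multiplies the stated count of delay sequences by the per-sequence probability $1/K^{s+1}$; the function names are hypothetical, the resulting product is a union-bound style quantity assembled from the two expressions in the proof, and treating $\log 1/\epsilon$ as the integer 1 in the example is an assumption made only for illustration.

```python
from fractions import Fraction
from math import comb

def delay_sequence_count(n_packets, s, d, k):
    """Upper bound on the number of delay sequences of length s.

    n_packets : N^(*), number of packets initially in the network
    d         : D, maximum number of edges traversed by a trace
    k         : K, size of the rank range
    """
    return n_packets * n_packets**s * comb(d + s, s) * comb(s + k, s + 1)

def delay_probability_bound(n_packets, s, d, k):
    """Count of delay sequences times the per-sequence probability 1/K^(s+1)."""
    return Fraction(delay_sequence_count(n_packets, s, d, k), k ** (s + 1))

# Parameter choice from the proof: K = 8 N^(*), s = D + 8 N^(*) + log(1/eps) - 1,
# here with N^(*) = 1, D = 1 and log(1/eps) = 1 (illustrative values).
n, d, log_inv_eps = 1, 1, 1
k = 8 * n
s = d + 8 * n + log_inv_eps - 1
bound = delay_probability_bound(n, s, d, k)
print(float(bound))
```

Exact rational arithmetic via `Fraction` avoids floating-point overflow, since both the count and $K^{s+1}$ grow very quickly with $s$.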

Since both chains $Q^{1}_{t}$ and $Q^{2}_{t}$ operate in parallel, the time for the two chains to couple, i.e., for all red and blue packets to get sunk, is the maximum of the times taken by each chain to get its respective packets sunk. So, using Lemma 4 for both chains, we bound the expected time for $Q_{t}^{1}$ , $Q_{t}^{2}$ to couple, $\tau_{couple}^{1,2}=\max\{T_{N^{(red)}},T_{N^{(blue)}}\}$ , as

[TABLE]

Note that this expected coupling time is similar to the delay result of Leighton et al. [15][14], reflecting the pipelining behaviour of the Data Collection process.

Now, to bound the distance between the two chains $Q_{t}^{1}$ and $Q_{t}^{2}$ we use the following result from Levin et al. [16].

Lemma 5 (Theorem 5.2, Levin et al. [16]).

Let $\{(X_{t},Y_{t})\}$ be a coupling with initial states $\bm{x},\bm{y}\in\mathcal{X}$ such that $X_{0}=\bm{x}$ and $Y_{0}=\bm{y}$ and coupling time defined as $\tau_{couple}:=\min\{t:X_{s}=Y_{s}\text{ for all }s\geq t\}$ , then,

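$$||\mathcal{P}^{t}[\bm{x},\cdot]-\mathcal{P}^{t}[\bm{y},\cdot]||_{TV}\leq P_{\bm{x},\bm{y}}\left[\tau_{couple}>t\right].$$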

Let $\bm{x},\bm{y}\in(\mathbb{N}\cup\{0\})^{|V|-1}$ be the initial states of the $Q_{t}^{1}$ and $Q_{t}^{2}$ chains. Then, using Lemma 5 and the expected coupling time from Eq. (35), for $||\mathcal{P}^{t}[\bm{x},\cdot]-\mathcal{P}^{t}[\bm{y},\cdot]||_{TV}\leq\epsilon$ we have

[TABLE]

Now, assume the stable data rate at which we are running these stochastic processes is $\beta=(1-\delta)\beta^{*}$ where $\beta^{*}$ is the critical data rate and $\delta\in(0,1)$ . Also, from linearity of $\bm{\eta}$ (see Eq. (14)) we know $\bm{\eta}_{\max}^{\beta}=\frac{\beta}{\beta^{*}}$ as $\bm{\eta}_{\max}^{\beta^{*}}=1$ and hence, we have $1-\bm{\eta}_{\max}^{\beta}=\delta$ . Using this in Eq. (36) we get the desired result. ∎
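As a worked instance of the substitution above, take the illustrative value $\delta=\frac{1}{4}$ (an assumed number, chosen only to make the arithmetic concrete):

```latex
\[
\beta=\tfrac{3}{4}\beta^{*}
\;\Longrightarrow\;
\bm{\eta}_{\max}^{\beta}=\frac{\beta}{\beta^{*}}=\tfrac{3}{4},
\qquad
1-\bm{\eta}_{\max}^{\beta}=\tfrac{1}{4}=\delta,
\qquad
\frac{1}{1-\bm{\eta}_{\max}^{\beta}}=4=\frac{1}{\delta}.
\]
```

So every factor of $\frac{1}{1-\bm{\eta}_{\max}}$ in the coupling time bound becomes a factor of $\frac{1}{\delta}$: halving the relative distance to the critical rate doubles that factor.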

To use Theorem 3 to prove the geometric ergodicity result (Corollary 1) we pick $\bm{y}$ according to the stationary distribution $\pi$ of the Data Collection process Markov chain.

Proof of Corollary 1.

Let us consider two instances of the Data Collection process, $Q_{t}^{1}$ and $Q_{t}^{2}$, such that the former starts from some finite state $\bm{x}\in(\mathbb{N}\cup\{0\})^{|V|-1}$ and the latter starts from stationarity, i.e., initially all queues in $Q_{t}^{1}$ are occupied by some finite number of packets and those of $Q_{t}^{2}$ are filled according to the stationary distribution $\pi$ . Then, from Theorem 3 we have

[TABLE]

where $t_{\mbox{\scriptsize hit}}$ is the worst-case hitting time of the random walk on the graph, $N^{(\bm{x})}$ and $N^{(\pi)}$ are the total numbers of data packets in state $\bm{x}$ and at stationarity, respectively, and $\delta$ is the relative distance from the critical data rate. Comparing Eq. (37) with Definition 1 (Eq. (2)) proves the geometric ergodicity property for the Markov chain $Q_{t}^{\beta}$ .

Now, for the random variable $N^{(\pi)}$ , let ${\mbox{E}\left[N^{(\pi)}\right]}$ be its expectation, i.e., the expected number of data packets in $Q_{t}^{\beta}$ at stationarity. By Little’s law [17], this is equal to the product of the data generation rate and the expected latency of a data packet to reach the sink at stationarity, i.e., ${\mbox{E}\left[N^{(\pi)}\right]}=\frac{\beta t_{\mbox{\scriptsize hit}}}{1-\bm{\eta}_{\max}^{\beta}}=\frac{(1-\delta)\beta^{*}t_{\mbox{\scriptsize hit}}}{\delta}$ (from linearity of $\bm{\eta}$ and $\beta=(1-\delta)\beta^{*}$ ), where $\beta^{*}$ is the critical data rate and $\delta\in(0,1)$ . Now, let $\epsilon_{\bm{x}}=\max\left\{\alpha\in[0,1]:N^{(\bm{x})}\leq\frac{(1-\delta)\beta^{*}t_{\mbox{\scriptsize hit}}}{\delta}\cdot\left(\log\frac{1}{\alpha}+1\right)\right\}$ . By the definition of $\epsilon_{\bm{x}}$ we have two regimes: $\epsilon\leq\epsilon_{\bm{x}}$ , where the ${\mbox{E}\left[N^{(\pi)}\right]}\cdot\left(\log\frac{1}{\epsilon}+1\right)$ term is dominant, and $\epsilon>\epsilon_{\bm{x}}$ , where $N^{(\bm{x})}$ is dominant.
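The threshold $\epsilon_{\bm{x}}$ has a closed form once a logarithm base is fixed. The sketch below assumes the natural logarithm (the base is not stated in this section) and uses hypothetical function names. Solving $N^{(\bm{x})}\leq{\mbox{E}\left[N^{(\pi)}\right]}\left(\log\frac{1}{\alpha}+1\right)$ for $\alpha$ gives $\alpha\leq\exp\left(1-N^{(\bm{x})}/{\mbox{E}\left[N^{(\pi)}\right]}\right)$, which is then capped at 1.

```python
import math

def expected_stationary_packets(delta, beta_star, t_hit):
    """E[N^(pi)] = (1 - delta) * beta_star * t_hit / delta, from Little's law."""
    return (1.0 - delta) * beta_star * t_hit / delta

def epsilon_x(n_x, delta, beta_star, t_hit):
    """Largest alpha in [0, 1] with n_x <= E[N^(pi)] * (log(1/alpha) + 1).

    Assumes the natural logarithm; another base changes only the constant.
    """
    mean = expected_stationary_packets(delta, beta_star, t_hit)
    return min(1.0, math.exp(1.0 - n_x / mean))

# Illustrative values: delta = 0.5, beta_star = 0.5, t_hit = 4 give E[N^(pi)] = 2.
print(epsilon_x(0, 0.5, 0.5, 4))   # all queues initially empty
print(epsilon_x(4, 0.5, 0.5, 4))   # initial load above the stationary mean
```

An initial state with no more packets than ${\mbox{E}\left[N^{(\pi)}\right]}$ yields $\epsilon_{\bm{x}}=1$, so only the regime $\epsilon\leq\epsilon_{\bm{x}}$ occurs; heavier initial loads shrink $\epsilon_{\bm{x}}$ exponentially.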

For the simple case of $\epsilon>\epsilon_{\bm{x}}$ , using Eq. (37) we have

[TABLE]

Similarly for $\epsilon\leq\epsilon_{\bm{x}}$ we have

[TABLE]

Setting the RHS of Eq. (39) to $\epsilon$ and solving for $t$ we get that

[TABLE]

Combining Eq. (38) and Eq. (40) gives us the result.

We observe that if we set $\bm{x}$ to $\bm{0}$ (all zeros), i.e., all queues are initially empty, then $\epsilon_{\bm{0}}$ is 1 so only Eq. (40) applies and we determine the mixing time by setting the RHS to $1/M$ for a given value of $M>0$ . ∎

6 The connection to algorithms and some future directions

The fact that the Data Collection Process mixes fast to its stationary distribution when started from the all-empty setting can be exploited to solve systems of equations such as Eq. (15) simply by allowing the process to get close enough to stationarity and then estimating $\bm{\eta}$ by keeping track of the number of time slots for which each queue is occupied. This opens up the possibility of distributed algorithms for effective resistance and other problems, some of which we have explored in [8]. Even for graph problems on very large graphs, Laplacian systems of equations become tractable via this method, since random walks can be simulated very fast on modern computing systems for graphs with millions of nodes (see, e.g., [25]).
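The estimation idea above can be sketched as a short simulation. The model details here are assumptions made for illustration rather than the paper's exact specification: each non-empty queue forwards one packet per slot to a uniformly random neighbor, each source generates a packet with probability `beta` after transmissions, and occupancy is sampled at the start of each slot. The function name `estimate_occupancy` and the `burn_in` parameter are hypothetical.

```python
import random

def estimate_occupancy(neighbors, sources, sink, beta, burn_in, steps, seed=0):
    """Estimate eta(u), the fraction of slots in which queue u is non-empty.

    neighbors : dict mapping each non-sink node to its list of neighbors
    sources   : nodes that generate a packet w.p. beta in every slot
    """
    rng = random.Random(seed)
    queues = {u: 0 for u in neighbors}        # start from the all-empty state
    busy = {u: 0 for u in neighbors}
    for t in range(burn_in + steps):
        if t >= burn_in:                      # count occupied slots after burn-in
            for u, q in queues.items():
                busy[u] += q > 0
        # Decide all transmissions from the state at the start of the slot.
        moves = [(u, rng.choice(neighbors[u])) for u, q in queues.items() if q > 0]
        for u, v in moves:
            queues[u] -= 1
            if v != sink:                     # the sink absorbs packets
                queues[v] += 1
        for u in sources:
            if rng.random() < beta:
                queues[u] += 1
    return {u: busy[u] / steps for u in neighbors}

# Path 0 - 1 - 2 with sink 2 and source 0, run at beta = 0.2.
eta_hat = estimate_occupancy({0: [1], 1: [0, 2]}, [0], 2, 0.2,
                             burn_in=1_000, steps=100_000)
```

For this sketch's model on the path 0 - 1 - 2, flow conservation in the stable regime gives $\bm{\eta}(0)=\beta+\bm{\eta}(1)/2$ and $\bm{\eta}(1)=\bm{\eta}(0)$, so both occupancies equal $2\beta$ and the estimates can be compared against $0.4$.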

The key shortcoming of our work is that the Data Collection Process in the subcritical region models only one-sink Laplacian systems of equations. A model that captures the full generality of Laplacian systems of equations will open a more general class of problems that can be attacked algorithmically using this method.

Bibliography


[1] M. K. An and H. Cho. Efficient data collection in interference-aware wireless sensor networks. Journal of Networks, 2015.
[2] L. Becchetti, V. Bonifaci, and E. Natale. Pooling or sampling: Collective dynamics for electrical flow estimation. In Proc. of the 17th Intl. Conf. on Autonomous Agents and Multi Agent Systems, AAMAS ’18, pages 1576–1584, Richland, SC, 2018. International Foundation for Autonomous Agents and Multiagent Systems.
[3] S. Boyd, A. Ghosh, B. Prabhakar, and D. Shah. Gossip algorithms: Design, analysis and applications. In Proc. of the 24th Annual Joint Conf. of the IEEE Computer and Comm. Societies, INFOCOM ’05, pages 1653–1664 vol. 3. IEEE, 2005.
[4] J. Cheeger. A lower bound for the smallest eigenvalue of the Laplacian. In Proc. of the Princeton conference in honor of Professor S. Bochner, pages 195–199, 1969.
[5] P. Christiano, J. A. Kelner, A. Mądry, D. A. Spielman, and S-H. Teng. Electrical flows, Laplacian systems, and faster approximation of maximum flow in undirected graphs. In Proc. of the 43rd annual ACM Symp. on Theory of computing, STOC ’11, pages 273–282. ACM, 2011.
[6] A. M. Frieze, N. Goyal, L. Rademacher, and S. Vempala. Expanders via random spanning trees. Volume 43, pages 497–513. SIAM, 2014.
[7] L. Georgiadis and W. Szpankowski. Stability of token passing rings. Queueing systems, 11(1-2):7–33, 1992.
[8] I. A. Gillani and A. Bagchi. A distributed Laplacian solver and its applications to electrical flow and random spanning tree computation. arXiv:1905.04989 [cs.DC], 2019.
