Random walk based in-network computation of arbitrary functions

Iqra Altaf Gillani; Pooja Vyavahare; Amitabha Bagchi

arXiv:1702.03741·cs.DC·June 22, 2021

Random walk based in-network computation of arbitrary functions

Iqra Altaf Gillani, Pooja Vyavahare, Amitabha Bagchi

PDF

TL;DR

This paper introduces decentralized algorithms based on random walks for computing arbitrary functions in network settings, providing bounds on computation rate and time for asymmetric functions.

Contribution

It presents the first fully decentralized algorithms for arbitrary functions modeled as binary trees, using random walks, with theoretical bounds on performance.

Findings

01

Lower bound on computation rate for asymmetric functions.

02

Upper bounds on average computation time.

03

Algorithms applicable to general network topologies.

Abstract

We study in-network computation on general network topologies. Specifically, we are given the description of a function, and a network with distinct nodes at which the operands of the function are made available, and a designated sink where the computed value of the function is to be consumed. We want to compute the function during the process of moving the data towards the sink. Such settings have been studied in the literature, but mainly for symmetric functions, e.g. average, parity etc., which have the specific property that the output is invariant to permutation of the operands. To the best of our knowledge, we present the first fully decentralised algorithms for arbitrary functions, which we model as those functions whose computation schema is structured as a binary tree. We propose two algorithms, Fixed Random-Compute and Flexible Random-Compute, for this problem, both of which…

Tables2

Table 1. Table 1: Rate lower bounds of Fixed Random-Compute for various graphs

Graph	$λ_{2}$	Using Theorem 1	Using Exact Rate
Cycle	$1 - O (\frac{1}{n^{2}})$ [25]	$\frac{1}{2 \sqrt{3} n^{2} (K - 1)}$	$\frac{1}{(n - 1) (K - 1)}$
Star Graph with sink at centre
and $ϵ$ as self loop probability at each node	$0$	$\frac{1}{2 \sqrt{3} {(n - 1)}^{1 / 2} (K - 1)}$	$\frac{1 - ϵ}{(K - 1)}$
Star Graph with
sink at outer node	$0$	$\frac{1}{2 \sqrt{3} {(n - 1)}^{1 / 2} (K - 1)}$	$\frac{1}{2 (n - 1) (K - 1)}$
$m$ -dimension
Hypercube with $n = 2^{m}$	$1 - \frac{1}{\log n}$ [25]	$\frac{1}{2 \sqrt{3} \log n (K - 1)}$	$\frac{n}{(n + 2 \log n) (K - 1)}$
Complete graph	$\frac{- 1}{n - 1}$	$\frac{n}{2 \sqrt{3} (n - 1) (K - 1)}$	$\frac{n}{3 (n - 1) (K - 1)}$
Random Geometric
Graph $(G (n, r))$	$1 - Θ (r^{2})$ [31]	$\frac{\log n}{2 \sqrt{3} n (K - 1)}$	-

Table 2. Table 2: Comparison of the computation time of Fixed and Flexible Random-Compute for various graphs for β < β ∗ 𝛽 superscript 𝛽 \beta<\beta^{*} i.e. c ( β ) < 1 𝑐 𝛽 1 c(\beta)<1 (All bounds on the mixing and hitting times appear directly or implicitly in [ 1 ] .)

Graph	Fixed Random-Compute	Flexible Random-Compute
Cycle	$O (\frac{\log K}{β} + h n^{2} \log K)$	$O (\frac{\log K}{β} + h n^{3} \frac{\log K}{\log n})$
Star with sink at centre	$O (\frac{\log K}{β} + h n \log K)$	$O (\frac{\log K}{β} + h \log^{3} K)$
Hypercube	$O (\frac{\log K}{β} + h n \log K)$	$O (\frac{\log K}{β} + h n \log \log n \log K)$
Rand. $r -$ reg.	$O (\frac{\log K}{β} + h n \log K)$	$O (\frac{\log K}{β} + h n \log K)$
Torus $(d = 2)$	$O (\frac{\log K}{β} + h n \log n \log K)$	$O (\frac{\log K}{β} + h n^{2} \frac{\log K}{\log n})$
Torus $(d > 2)$	$O (\frac{\log K}{β} + h n \log K)$	$O (\frac{\log K}{β} + h n^{\frac{d + 2}{d}} \frac{\log K}{\log n})$
Rand. geometric graph	-	$O (\frac{\log K}{β} + h n^{2} \frac{\log K}{\log n})$

Equations129

P [u, v] = {\frac{1}{\mbox d e g ( u )} 0 \leavevmode \mbox i f v \in Nbd (u), \leavevmode \mbox o t h er w i se

P [u, v] = {\frac{1}{\mbox d e g ( u )} 0 \leavevmode \mbox i f v \in Nbd (u), \leavevmode \mbox o t h er w i se

t \to \infty lim P [∣∣ Q_{t}^{β} ∣ ∣_{\infty} < x] = F (x), \mbox an d x \to \infty lim F (x) = 1,

t \to \infty lim P [∣∣ Q_{t}^{β} ∣ ∣_{\infty} < x] = F (x), \mbox an d x \to \infty lim F (x) = 1,

\frac{1 - λ _{2}}{2 3 ( K - 1 )} (\frac{d _{min}}{d _{ma x}})^{1/2} \leq β \leq δ,

\frac{1 - λ _{2}}{2 3 ( K - 1 )} (\frac{d _{min}}{d _{ma x}})^{1/2} \leq β \leq δ,

τ_{\mbox f (K)}^{ℓ} := t min {Y_{f_{K} (j)}^{t} = u_{\mbox s}, \forall \leavevmode 1 \leq j \leq ℓ} .

τ_{\mbox f (K)}^{ℓ} := t min {Y_{f_{K} (j)}^{t} = u_{\mbox s}, \forall \leavevmode 1 \leq j \leq ℓ} .

\overset{τ}{ˉ}_{\mbox f (K)} = ℓ \to \infty lim \frac{τ _{\mbox f (K)}^{ℓ}}{ℓ}

\overset{τ}{ˉ}_{\mbox f (K)} = ℓ \to \infty lim \frac{τ _{\mbox f (K)}^{ℓ}}{ℓ}

\overset{τ}{ˉ}_{\mbox f (K)} \leq α lo g K (\frac{1}{β} + \frac{h t _{\mbox hi t}}{1 - c ( β )})

\overset{τ}{ˉ}_{\mbox f (K)} \leq α lo g K (\frac{1}{β} + \frac{h t _{\mbox hi t}}{1 - c ( β )})

\overset{τ}{ˉ}_{\mbox f (K)} \leq \overset{α}{^} lo g K (\frac{1}{β} + \frac{h t _{\mbox mi x}^{G}}{1 - c ( β )} (lo g^{2} K + \frac{n}{ν lo g n}))

\overset{τ}{ˉ}_{\mbox f (K)} \leq \overset{α}{^} lo g K (\frac{1}{β} + \frac{h t _{\mbox mi x}^{G}}{1 - c ( β )} (lo g^{2} K + \frac{n}{ν lo g n}))

β \geq (1 - λ_{2}) \frac{d _{min}}{d _{ma x} \leavevmode 2 s ( s + 1 )}

β \geq (1 - λ_{2}) \frac{d _{min}}{d _{ma x} \leavevmode 2 s ( s + 1 )}

δ_{i} := U \subset V : i \in U, u_{\mbox s} \in / U min u \in U, v \in / U \sum P [u, v] .

δ_{i} := U \subset V : i \in U, u_{\mbox s} \in / U min u \in U, v \in / U \sum P [u, v] .

P [A] = i = 1 \sum ℓ (ℓ - i t _{u}) β^{ℓ - i} (1 - β)^{t_{u} - (ℓ - i)}

P [A] = i = 1 \sum ℓ (ℓ - i t _{u}) β^{ℓ - i} (1 - β)^{t_{u} - (ℓ - i)}

P [\exists u : t_{u} \geq w ℓ] \leq K ℓ e^{- ℓ}

P [\exists u : t_{u} \geq w ℓ] \leq K ℓ e^{- ℓ}

\mbox E [\hat{t}_{u} - lo g eK]

\mbox E [\hat{t}_{u} - lo g eK]

\leq w j = 1 \sum \infty P [\hat{t}_{u} \geq w (lo g eK + j)]

\leq w j = 1 \sum \infty K (lo g eK + j) e^{- (l o g eK + j)}

\leq w lo g eK

\mbox E [\hat{t}_{u}] \leq lo g eK + w lo g eK \leq w lo g eK

\mbox E [\hat{t}_{u}] \leq lo g eK + w lo g eK \leq w lo g eK

τ_{\mbox a pp}^{ℓ} \leq ℓ lo g eK \frac{b}{β}

τ_{\mbox a pp}^{ℓ} \leq ℓ lo g eK \frac{b}{β}

τ_{T_{i}} \leq

τ_{T_{i}} \leq

+ max {τ_{T_{i - 2, 00}}, τ_{T_{i - 2, 01}}, τ_{T_{i - 2, 10}}, τ_{T_{i - 2, 11}}} .

τ_{T_{h}} \leq i = 1 \sum h 0 \leq j \leq 2^{i - 1} - 1 max {τ_{(ϕ (i, 2 j), ϕ (i, 2 j + 1); ϕ (i - 1, j))}} .

τ_{T_{h}} \leq i = 1 \sum h 0 \leq j \leq 2^{i - 1} - 1 max {τ_{(ϕ (i, 2 j), ϕ (i, 2 j + 1); ϕ (i - 1, j))}} .

\mbox E [τ_{ma x} - lo g eK]

\mbox E [τ_{ma x} - lo g eK]

\leq e t_{\mbox hi t} j = 1 \sum \infty P [τ_{ma x} \geq (lo g eK + j) e t_{\mbox hi t}]

\leq e t_{\mbox hi t} j = 1 \sum \infty 2 (K - 1) e^{- l o g eK} e^{- j}

\leq 2 t_{\mbox hi t} \frac{K - 1}{K ( e - 1 )}

\mbox E [τ_{ma x}] \leq lo g eK + 2 t_{\mbox hi t} \frac{K - 1}{K ( e - 1 )} \leq t_{\mbox hi t} lo g eK

\mbox E [τ_{ma x}] \leq lo g eK + 2 t_{\mbox hi t} \frac{K - 1}{K ( e - 1 )} \leq t_{\mbox hi t} lo g eK

P [\mbox P a c k e t p \mbox i s d e l a y e d a t u \mbox a tt im e t] =

P [\mbox P a c k e t p \mbox i s d e l a y e d a t u \mbox a tt im e t] =

w \geq 2 \sum P [\mbox P a c k e t p \mbox i s n o tp i c k e d a t u \mbox a tt im e t \cap Q_{t}^{β} (u) = w ∣ X_{p}^{t} = u] P [X_{p}^{t} = u]

P [\mbox P a c k e t p \mbox i s n o tp i c k e d a t u \mbox a tt im e t \cap Q_{t}^{β} (u) = w ∣ X_{p}^{t} = u] =

P [\mbox P a c k e t p \mbox i s n o tp i c k e d a t u \mbox a tt im e t \cap Q_{t}^{β} (u) = w ∣ X_{p}^{t} = u] =

\frac{w - 1}{w} \frac{P [ Q _{t}^{β} ( u ) = w \cap X _{p}^{t} = u ]}{P [ X _{p}^{t} = u ]}

P [\mbox P a c k e t p \mbox i s n o tp i c k e d a t u \mbox a tt im e t \cap Q_{t}^{β} (u) = w ∣ X_{p}^{t} = u] \leq

P [\mbox P a c k e t p \mbox i s n o tp i c k e d a t u \mbox a tt im e t \cap Q_{t}^{β} (u) = w ∣ X_{p}^{t} = u] \leq

\frac{w - 1}{w} \frac{P [ Q _{t}^{β} ( u ) = w ]}{P [ X _{p}^{t} = u ]} \leq \frac{P [ Q _{t}^{β} ( u ) = w ]}{P [ X _{p}^{t} = u ]}

P [\mbox P a c k e t p \mbox i s d e l a y e d a t u \mbox a tt im e t] \leq w \geq 2 \sum P [Q_{t}^{β} (u) = w] = P [Q_{t}^{β} (u) \geq 2]

P [\mbox P a c k e t p \mbox i s d e l a y e d a t u \mbox a tt im e t] \leq w \geq 2 \sum P [Q_{t}^{β} (u) = w] = P [Q_{t}^{β} (u) \geq 2]

P [\mbox P a c k e t p \mbox i s d e l a y e d a tt im e t] =

P [\mbox P a c k e t p \mbox i s d e l a y e d a tt im e t] =

u \in V \sum P [\mbox P a c k e t p \mbox i s d e l a y e d a t u \mbox a tt im e t ∣ X_{p}^{t} = u] P [X_{p}^{t} = u]

τ_{\mbox co m p} (ℓ) \leq ℓ τ_{\mbox co m p} (1) = ℓ (h \frac{t _{\mbox hi t} lo g eK}{1 - c ( β )}) .

τ_{\mbox co m p} (ℓ) \leq ℓ τ_{\mbox co m p} (1) = ℓ (h \frac{t _{\mbox hi t} lo g eK}{1 - c ( β )}) .

τ_{\mbox f (K)}^{ℓ} = τ_{\mbox a pp}^{ℓ} + τ_{\mbox co m p} (ℓ) .

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Full text

Random Walk-based In-network Computation of Arbitrary Functions

Iqra Altaf Gillani

Department of Computer Science and Engineering, IIT Delhi

{iqraaltaf,bagchi}@cse.iitd.ac.in

Pooja Vyavahare

School of Computing and Electrical Engineering, IIT Mandi

[email protected]

Amitabha Bagchi

Department of Computer Science and Engineering, IIT Delhi

{iqraaltaf,bagchi}@cse.iitd.ac.in

Abstract

We study in-network computation on general network topologies. Specifically, we are given the description of a function, and a network with distinct nodes at which the operands of the function are made available, and a designated sink where the computed value of the function is to be consumed. We want to compute the function during the process of moving the data towards the sink. Such settings have been studied in the literature, but mainly for symmetric functions, e.g. average, parity etc., which have the specific property that the output is invariant to permutation of the operands. To the best of our knowledge, we present the first fully decentralised algorithms for arbitrary functions, which we model as those functions whose computation schema is structured as a binary tree. We propose two algorithms, Fixed Random-Compute and Flexible Random-Compute, for this problem, both of which use simple random walks on the network as their basic primitive. Assuming a stochastic model for the generation of streams of data at each source, we provide a lower and an upper bound on the rate at which Fixed Random-Compute can compute the stream of associated function values. Note that the lower bound on rate though computed for our algorithm serves as a general lower bound for the function computation problem and to the best of our knowledge is first such lower bound for asymmetric functions. We also provide upper bounds on the average time taken to compute the function, characterising this time in terms of the fundamental parameters of the random walk on the network: the hitting time in the case of Fixed Random-Compute, and the mixing time in the case of Flexible Random-Compute.

Keywords: Function computation, random walks, coalescing random walks, stable rate

1 Introduction

Since most commercially available sensor nodes used in today’s sensor networks are capable of performing small operations on the data, distributed function computation (also known as in-network computation) algorithms seek to exploit the computation capability of these nodes to increase the efficiency of communication and computation of function [22, 24, 19] over plain data forwarding techniques [20, 17]. The in-network computation paradigm is based on the following simple idea: instead of moving all the distinct data items generated at different nodes of the network to the sink and computing the function of interest at the sink, we leverage the meetings of the data items at intermediate nodes to compute partial functions which can then be combined at the sink. The idea is to reduce the load on the network and thereby increase the rate at which the data can be read.

Clearly, the key question here is: is it always possible to combine any two data items that meet at a node? In the simplest scenario, the answer is yes, and the class of functions for which the answer is yes are known in the literature as “symmetric functions” and most of the existing literature on in-network function computation mainly deals with such functions (see, e.g.,[15]). In this paper, our aim is to study the distributed computation of arbitrary functions where the sequence of operations of the data is important to compute the final function value. Generalising from the class of symmetric functions requires us to be able to specify the kinds of functions we handle and typically functions are specified by a computation schema which describes the structure of allowable combinations of operands and partial functions that make up the final function being computed. In particular, we study a class of functions whose computation schema can be represented by a binary tree. Another major contribution of our work is to distinguish between two modes of computing intermediate values (partial functions) that make up a larger function contribution, the distinction made on the basis of whether the node at which a specific intermediate value is computed is specified in advance (we call it the “Fixed” scenario) or not (the “Flexible” scenario.)

The distributed computation of functions with computation schema modelled by a directed graph has recently attracted some attention in the literature [32, 26, 34], but these works typically take a centralised approach to the problem, centrally computing routings that realise the function within the network. Our approach, based on random walks, is fundamentally decentralised. Such a communication strategy is very useful when the network is changing constantly and the routing information can become invalid frequently. Any node of the network in our algorithm does not need global information (like the number of nodes or topology of the network) and the communication depends only on the knowledge of the neighbourhood. In our algorithms, multiple data packets move across the network leading to multiple random walks in the network. In order to compute the function in the network, our algorithms combine these data packets in the order defined by the computation schema of the function. This combination of packets leads to coalescence of random walks which have been studied in [9, 21].

Our contributions

We describe two in-network computation scenarios: (a) the fixed scenario under which each internal node of the function schema is mapped to a specific network node that is tasked with computing the subfunction corresponding to that internal node of the schema and (b) the flexible scenario in which a subfunction computation takes place opportunistically at any network node that happens to have the two relevant operands. 2. 2.

We propose simple, decentralised random walk-based algorithms, Fixed Random-Compute and Flexible Random-Compute, to compute a function with a binary tree computation schema in the fixed and flexible network scenarios respectively. 3. 3.

We present upper and lower bounds on the rate of computation for Fixed Random-Compute in a setting where data is generated stochastically in rounds. In particular, the lower bound presented is a general lower bound for the function computation problem and to the best of our knowledge is the first of its kind for asymmetric functions. We also find the average function computation time taken by these algorithms.

Related work

As discussed earlier, most of the extant literature study the in-network computation of symmetric functions (see [15, 28, 12, 18]) where the sequence of computation does not matter. Researchers have studied the computation time as well as the rate of computation of various classes of symmetric functions for different network models. In particular, [15] study the rate of computation of classes of symmetric functions namely type-thereshold (example: maximum) and type-sensitive (example: average) for collocated and random planar multihop networks. Following the work of [15], various authors studied these functions in various network settings; for example [12] studied for noisy communication model, [23] for grid networks and [24] for random geometric graphs. Further, Banerjee et. al. [4] study the rate of computation of divisible functions which can be computed by a divide-and-conquer method on any subset of the source data. Another variant studied in the literature is of $\lambda_{f}$ -divisible functions which can be computed by performing operations on at most $\lambda_{f}$ source data values at any time [22]. Note that all the work discussed above only present upper bounds on the rate. However, Kamath et al. [19] find a lower bound for computing a symmetric function MAX in a pipelined setting different than ours, which requires knowledge about network structure. To the best of our knowledge no lower bound result has been presented for asymmetric functions so far.

In this work, we not only extend the function computation study to the class of arbitrary functions but also present a general lower bound on the rate of function computation which is a first lower bound presented for this class of problem. In particular, we present random walk-based algorithms to compute arbitrary functions with binary tree schema. The communication strategy in our setting is similar to that of gossip algorithms [6, 28, 31] where each node sends a data packet to a randomly chosen neighbour in every time slot. In this setting, the time to compute the average of data values is studied by [6] and separable functions of data are studied in [28]. Separable functions are a class of functions which can be written as the sum of functions on the subsets of source data values. Mosk-Aoyama and Shah [28] give an upper bound on the computation of such functions which is directly proportional to the logarithm of the size of the network and inversely proportional to the conductance of the network. The analysis and thereby the results of gossip algorithms depends on the random walk properties which in turn depends on the transition matrix of the random walk. Our results also depend on the random walk parameters like the spectral gap of transition matrix, hitting time and mixing time.

Moreover, in our setting movement of different data operands in the network essentially leads to multiple random walks on the network. These random walks combine with each other based on the sequence defined by the computation schema resulting in the coalescence of random walks. Properties of multiple random walks have been studied in the literature [2, 10, 13, 29]. Interaction of multiple random walks has been studied by [10] where two random walks either annihilate each other or combine when they meet and a bound on the cover time of such walks has been presented. Coalescing random walks (two random walks combine when they meet) has been studied by [9, 21]. An upper bound on the time to coalesce $n$ random walks each starting from different vertices in the graph is given by [9]. Recently [21] improved this bound on coalescence time for various kinds of graphs. Our analysis of algorithms also involves multiple and coalescing random walks.

Paper organization

In Section 2, we describe our in-network computation model, the data generation model (Section 2.1), the class of functions we deal with (Section 2.2) and the two in-network computation scenarios (Section 2.3) that we propose: the fixed and the flexible. In Section 3, we first define the routing scheme used by our algorithms and then we discuss our two random walk-based algorithms for the two scenarios in Section 3.2 and Section 3.3 respectively. In Section 4, we define our performance metrics and discuss our main results with some examples. In Section 5, we present the proofs of our theorems. In particular, we discuss the proof of the rate analysis of Fixed Random-Compute algorithm in Section 5.1 and in Section 5.2 we analyse the average time taken by our algorithms in computing an asymmetric function in both fixed and flexible models. We conclude in Section 6 and provide some directions for future work.

2 Modelling Assumptions and Computation Scenarios

In this section, we first describe our network, data generation and computation model. Then, we define the particular function schema we work with and discuss the two in-network computing scenarios, the “Fixed” and the “Flexible” where such functions can be computed.

2.1 Network and Computation Model

The network model

The communication network is denoted by an undirected connected graph $G=(V,E),$ where $V$ is the set of $n$ nodes, $V_{s}\subseteq V$ with $|V_{s}|=K$ is the set of source nodes and $E$ is the set of $m$ edges. An edge $e=(u,v)$ is present between nodes $u,v\in V$ if they can communicate with each other and we denote $\mathsf{Nbd}(u):=\{v\in V|(u,v)\in E\}$ . The nodes in the network follow a slotted time model for communication and we assume that each node can send at most one data packet to a single neighbour in any time slot. This is known as the transmitted gossip constraint in the gossip literature [28]. Note that under this network model a node may receive multiple packets in one time slot.

The model of computation in the network

We need to compute a function $f_{K}$ defined as $f_{K}:=f_{K}(x_{1},x_{2},\ldots,x_{K})$ in the given network. The operand $x_{i}$ is generated by a data generation model at source node $u_{i}\in V_{s}$ and we have the source set $V_{s}=\{u_{1},u_{s},\ldots,u_{K}\}$ . We need to compute $f_{K}(x_{1},x_{2},\ldots,x_{K})$ and make it available at a designated node of the network called the sink $u_{\mbox{\scriptsize s}}\in V$ . We work within the paradigm of in-network computation [4] for computing $f_{K}$ , i.e., nodes can compute intermediate functions called subfunctions of the data while relaying it in the network. We elaborate this further when we discuss the two specific in-network computation scenarios we work with.

The data generation model

We consider the data generation process at each source node $u_{i}\in V_{s}$ as a stochastic arrival process in discrete time that is Bernoulli with parameter $\beta$ and independent of the arrivals taking place at all other nodes, i.e. at each time slot $t$ each node generates a new data packet with probability $\beta$ independent of all other nodes. We partition the data generated into rounds: the $\ell^{th}$ data packet to be generated at source node $u_{i}\in V_{s}$ is said to be part of round $\ell$ . If we denote this data item as $x_{i}(l)$ then note that the function computed in round $\ell$ is $f_{K}(x_{1}(\ell),x_{2}(\ell),\ldots,x_{K}(\ell))$ . So, each data packet is identified by an identifier $\ell$ which denotes their associated round. We will refer to this model of data generation as independent Bernoulli data generation with parameter $\beta$ . We will use this data generation model for analysing our algorithms and then in Section 6 we will also discuss two other data generation models and present our latency results in their context.

2.2 The Binary Tree Function Schema

After defining our network model, model of computation and data generation model now we will define the type of functions we work with i.e. particular class of asymmetric or arbitrary functions. Recall, a function is symmetric if the output value does not depend on the permutation of input values and sequence of intermediate operations does not matter. To generalise from the notion of symmetric to asymmetric we need some language to describe function classes that allow only specific argument combinations. We use the abstraction of a directed acyclic graph to capture the partial combinations that are possible in a given function. We call this graph to be the computation schema of the function. In this work, we study a class of functions whose computation schema is a directed tree. An example is shown in Figure 1.

Our algorithms and results are presented for a binary tree computation schema, but this does not, in fact, lead to any great loss in generality when it comes to the class of asymmetric functions. In the function computation schema, any intermediate node representing unary operation can be merged with its parent node as it requires only one operand. The unary operation can be performed by the network node which performs the operation for its parent node. Thus a function computation schema can have only binary and $M$ -ary operations. Our results of binary tree schema can easily be extended to arbitrary directed trees (like $M$ -ary trees) and we discuss these extensions in Section 6. With this in mind, we now formalise the notion of binary tree computation schema and present the notation we will use in the subsequent sections.

Let $f_{K}:=f_{K}(x_{1},x_{2},\ldots,x_{K})$ be a function whose schema is described by a binary tree $\mathcal{T}$ with $L\leq K$ leaves as sources and height $h$ where $\log K\leq h<K$ . For complete binary tree $L=K$ , where $K=2^{r}$ for some value $r$ and $h=\log K$ . So, let the root of the tree where finally $f_{K}$ is computed be at level 0 and have id $\mathcal{T}(0,0)$ ; note it is the only node at this level. In general, level $i$ has at most $2^{i}$ nodes with id set $\{\mathcal{T}(i,0),\mathcal{T}(i,1),\ldots,\mathcal{T}(i,2^{i}-1)\}$ . Let $\theta(i,j)$ be the value computed at node with id $\mathcal{T}(i,j)$ . For level $i=h$ i.e. the leaf level, each node performs an identity function on data so, $\theta(i,j)=x_{j+1}$ for $0\leq j<2^{i}$ and in general for level $0\leq i<h$ and $0\leq j<2^{i}$ , node with id $\mathcal{T}(i,j)$ computes the function $\theta(i,j)=\theta(i+1,2j)\oplus\theta(i+1,2j+1)$ , where $\oplus$ is the binary operation to be performed by node $\mathcal{T}(i,j)$ specified by the function schema. Note that in case of a non-complete binary tree such node $\mathcal{T}(i,j)$ can compute identity function as well if it corresponds to a source node. Also, we have $f_{K}=\theta(0,0)$ .

Consider an example schema for function $f_{4}=x_{1}x_{2}+x_{3}x_{4}$ as shown in the Figure 1a. Note that it is a complete binary tree with height $h=2$ and the source nodes are present at the leaf level. The data packets $x_{1},x_{2}$ are called the operands for the subfunction $x_{1}x_{2}$ and are obtained by the identity functions on the respective nodes i.e. $\theta(2,0)=x_{1}$ and $\theta(2,1)=x_{2}$ . Also, the nodes labelled $\mathcal{T}(1,0),\mathcal{T}(1,1),\mathcal{T}(0,0)$ represent the specific operations that need to be performed on the data. For example, node $\mathcal{T}(1,0)$ performs the multiplication operation and represents the subfunction $\theta(1,0)=\theta(2,0)\times\theta(2,1)=x_{1}x_{2}$ for the function $\theta(0,0)=f_{4}=x_{1}x_{2}+x_{3}x_{4}$ . Next, consider a non-complete binary tree schema for function $f_{4}=(y_{1}y_{2}+y_{3})y_{4}$ as shown in the Figure 1b with height $h=3$ . In this case, apart from the leaf level source nodes are present at other levels as well and they perform identity function like $\theta(2,1)=y_{3}$ and $\theta(1,1)=y_{4}$ . All other nodes perform specific operation on data like node $\mathcal{T}(0,0)$ combines the subfunctions $(y_{1}y_{2}+y_{3})$ and $y_{4}$ to compute function $\theta(0,0)=f_{4}=(y_{1}y_{2}+y_{3})y_{4}$ .

2.3 In-network Computation Scenarios

Now, for the given function schema i.e. the binary tree schema let us look at the two computation scenarios by which we can compute such functions. These two scenarios differ based on whether the node at which a specific intermediate value is to be computed is known in advance or not.

In-network computation scenario 1: The Fixed model

In this model, any subfunction of computation schema $\mathcal{T}$ can be performed only at a specific node in the network i.e. we are given a mapping $\phi:\mathcal{T}\mapsto V.$ We assume $\phi$ is a one-to-one mapping. The schema node id $\phi^{-1}(u)$ is hard-wired into $u\in V$ at the time of deployment of the network so that $u$ knows the specific operation which it has to perform along with the data identifiers of the operands that are the arguments of that operation. Every node $u$ that is in the codomain of $\phi$ maintains two queues; one for storing the data operands of the operation specified by $\phi^{-1}(u)$ , we call it $\mathcal{C}_{t}(u)$ , and another for data transmission, namely $Q_{t}(u)$ . Once packets of both data operands are received in $\mathcal{C}_{t}(u),$ node $u$ performs the operation defined by $\phi^{-1}(u)$ and stores the generated data packet in $Q_{t}(u).$ This increases the data transmission queue size by one. Packets other than these operands are directly stored in $Q_{t}(u)$ for future transmissions.

Consider the schema of Figure 1a and the network of Figure 2a with a fixed mapping of nodes in schema to the nodes of the network. Note, $V_{s}=\{x_{1},x_{2},x_{3},x_{4}\}$ is the source set for function $f_{4}$ which is mapped to schema nodes $\mathcal{T}(2,0),\mathcal{T}(2,1),\mathcal{T}(2,2),\mathcal{T}(2,3)$ respectively and nodes $\mathcal{T}(1,0),\mathcal{T}(1,1),\mathcal{T}(0,0)$ of schema are mapped to nodes $t,v,w$ of the network respectively. So, network nodes $t,v,w$ know the ids of their operands and the function they need to compute and maintain both data operand queue and data transmission queue. On the other hand, network node $u$ (Figure 2b) is not mapped to any node in schema so it maintains only data transmission queue and relays data $x_{1},x_{2}$ rather than performing the operation specified by $\mathcal{T}(1,0)$ on them (see Figure 2c). The operation of $\mathcal{T}(1,0)$ is performed by node $t$ at the end of $5^{th}$ time slot (Figure 2d). Note that the network node $w$ stores the operand $x_{3}x_{4}$ in its data operand queue till it receives the other operand to perform the operation specified by $\mathcal{T}(0,0)$ in Figures 2d, 2e.

In-network computation scenario 2: The Flexible model

In the flexible network model, every node in the network knows the entire schema $\mathcal{T}$ and can perform any operation within it. If the operand data packets for any subfunction of $\mathcal{T}$ are available at any network node $u\in V$ at any time $t$ then it performs the required operation and creates the data packet for the corresponding subfunction. Every node $u$ at time $t$ in the network maintains a single queue $Q_{t}(u)$ of data packets which it has received (or generated) so far, and has not transmitted (or used for generating any subfunction) yet. For a received packet, if the corresponding operand is available in the queue then they are combined i.e. we perform coalescence of packets and the new packet is stored in the queue for future transmission. On the other hand, if the corresponding operand for the received packet is not present in the queue then the received packet is simply stored in the queue without any coalescence.

Consider the schema $\mathcal{T}$ of Figure 1 and network $G$ of Figure 3a. In this model, any node can compute the subfunctions of schema provided it has both the data operands required for computing the subfunction. See Figure 3b, where network node $u$ receives data packets $x_{1},x_{2}$ at the end of two time slots and performs the operation of node $\mathcal{T}(1,0)$ generating $x_{1}x_{2}.$ Similarly, node $v$ of the network performs the operation of schema node $\mathcal{T}(1,1)$ and $w$ performs the operation of schema node $\mathcal{T}(0,0)$ in the network. All other nodes of the network relay the data packets.

3 Our Algorithms

In this section, we first discuss the random walk-based routing primitive that we use and then we describe our two algorithms for the two in-network computing scenarios discussed before.

3.1 Routing Scheme Used by Our Algorithms

The nodes in the network follow a slotted time model for communication and only one packet can be transmitted over an edge in a slot. Our algorithms use a push communication model [28] where at the start of any time step $t$ , every node $u\in V$ selects a node $v$ with probability

[TABLE]

independent of other nodes and previous time step selections and sends it a randomly chosen data packet from the queue $Q_{t}(u)$ (if it is not empty). Note that this corresponds to the data packets performing simple random walk on the graph of the network. Next, we present the algorithms for the two in-network computation scenarios.

3.2 Fixed Random-Compute Algorithm

In the fixed model, we use the algorithm Fixed Random-Compute to compute $f_{K}$ (see Algorithm 1). In this algorithm, when a node which has a subfunction of $\mathcal{T}$ mapped to it receives a packet it checks to see if the packet is one of the operands of that subfunction. If it is then the node checks its operand queue to see if it has the other operand. If the other operand is available, it combines the two as per the subfunction and moves the combination into its transmission queue. If not, it stores the received packet in the operand queue. If the received packet is not relevant to the subfunction mapped to this node, the node simply places it in its transmission queue for the onward relay. In every time step, the node also chooses a packet uniformly at random and transmits it to a neighbour chosen according to the probability distribution $\mathcal{P}[\cdot,\cdot]$ . An example run of the Fixed Random-Compute algorithm is shown in Figure 2.

3.3 Flexible Random-Compute Algorithm

We propose the algorithm Flexible Random-Compute for the flexible computation scenario (see Algorithm 2). This algorithm works by performing combinations allowed by $\mathcal{T}$ opportunistically: when a node receives a new packet from a neighbour it checks its transmission queue to see if that packet can be combined with any packet currently in the queue. If such a combination is possible, it performs it and places the combined value in its transmission queue. As in the case of Fixed Random-Compute here too in every time step the node also chooses a packet uniformly at random and transmits it to a neighbour chosen according to the probability distribution $\mathcal{P}[\cdot,\cdot]$ . An example run of the Flexible Random-Compute algorithm is shown in Figure 3.

4 Performance Metrics and Our Results

In this section, we define our performance metrics, state the main theorems reflecting analysis of those metrics and then, we discuss the consequences of those theorems using examples based on some common network topologies.

Typically in-network computation algorithms are analysed on two metrics: (a) the rate at which the computation can be carried out given a model of regular data generation [28, 22] and (b) the computation time or the delay in computing the function under the assumption that all operands are available at their respective source nodes. These two correspond to notions of the throughput and latency in the in-network computation setting.

For our rate analysis, we consider the independent Bernoulli data generation model with parameter $\beta$ as described in Section 2.1. Note that this data generation model is stochastic and we use techniques from queuing theory to define the best possible function computation rate. The condition for the system to continue processing the data in a regular fashion is known as stability in the queueing literature [16] and is characterised by the fact that the expected size of each queue is finite. A rate $\beta$ that allows a given algorithm to achieve this condition is termed as the stable rate of computation for it.

Formally, following Szpankowski [33], we formally define the stable rate of computation of any in-network computation algorithm working under the independent Bernoulli data generation model as follows:

Definition 1 (Stable rate of computation).

For any in-network computation algorithm with the data generation as independent Bernoulli process with parameter $\beta$ at all source nodes and with $\lvert V\rvert$ dimensional vector $Q_{t}^{\beta}$ representing the state of the network at time $t$ , the data rate $\beta$ is said to be stable if the following holds

[TABLE]

where $||Q^{\beta}_{t}||_{\infty}=\max\{Q^{\beta}_{t}(u):u\in V\}$ and $F(x)$ is the limiting distribution.

We have dropped superscript $\beta$ from our data transmission queue representation where the rate is understood. Now, we will present the bounds for such a stable rate for the Fixed Random-Compute algorithm.

Theorem 1.

Given a network $G=(V,E)$ with source set $V_{s}\subset V$ of $|V_{s}|=K$ source nodes each receiving independent Bernoulli arrivals with stable rate $\beta$ and a function schema $\mathcal{T}$ to compute a function $f_{K}$ with binary tree schema $\mathcal{T}$ on $K$ sources, the stable rate $\beta$ of Fixed Random-Compute algorithm is given by:

[TABLE]

where $\lambda_{2}$ is the second largest eigenvalue of the transition matrix $\mathcal{P}$ of Fixed Random-Compute, $d_{min}$ and $d_{max}$ are the minimum and maximum degree of graph $G$ , $\delta_{i}:=\min_{U\subset V:i\in U,u_{\mbox{\scriptsize s}}\notin U}\sum_{u\in U,v\notin U}\mathcal{P}[u,v]$ is the mincut between node $i$ and sink $u_{\mbox{\scriptsize s}}$ and $\delta:=\min_{i\in V}\delta_{i}$ is the min-mincut of $G$ .

Regarding the stable rate for the flexible model, we do not have any bounds for this model for now. We discuss it in Section 6 as part of the future work.

Rate results: Discussion and examples

In Table 1, we present two rate bounds for some common network topologies. First we present the lower bounds obtained from Theorem 1 and then we present the lower bounds obtained by calculating the exact values using elementary algebra (see [14] for details). We note that for the complete graph the lower bound given by Theorem 1 is tight up to small constants, but in the case of the cycle a direct calculation gives us a rate that is higher by $\Theta(n)$ .

Our next metric related to the delay or the latency is what we call the average function computation time of an in-network computation algorithm. Now, under our regular data generation model with rate $\beta$ let the computed value of function $f_{K}$ for the $j^{th}$ data round be $f_{K}(j):=f_{K}(x_{1}(j),x_{2}(j),\ldots,x_{K}(j))$ and let random variable $Y_{f_{K}(j)}^{t}$ denote the position of the $j^{th}$ round computed value $f_{K}(j)$ at the start of time slot $t$ given that the data packets perform simple random walk on $G$ . Then, the function computation time of $\ell$ data rounds $\tau_{\mbox{\scriptsize f(K)}}^{\ell}$ is defined as,

[TABLE]

So, this represents the earliest time by which the computed value of $\ell$ data rounds is available at the sink. So, now we can define the average computation time as follows.

Definition 2 (Average function computation time).

The average function computation time for the network is defined as

[TABLE]

where $\tau_{\mbox{\scriptsize f(K)}}^{\ell}$ is the the function computation time of the first $\ell$ rounds of data.

For Fixed Random-Compute we prove the following theorem:

Theorem 2 (Fixed model computation time).

Given a network $G=(V,E)$ of $|V|=n$ nodes with set $V_{s}\subset V$ of $|V_{s}|=K$ source nodes each receiving independent Bernoulli arrivals with stable rate $\beta$ and a function $f_{K}$ with binary tree schema $\mathcal{T},$ for $\beta\leq 1/2$ , the average function computation time of $f_{K}$ using Fixed Random-Compute is

[TABLE]

where $\alpha>1$ is a constant, $h$ is the height of the binary tree schema $\mathcal{T}$ , $c(\beta)\in[0,1]$ is a continuous and increasing function of $\beta$ with $c(0)=0$ and $c(\beta)\rightarrow 1$ as $\beta\rightarrow\beta^{*}$ where $\beta^{*}$ is the critical rate below which data rates are stable and above which they are unstable and $t_{\mbox{\scriptsize hit}}=\max\limits_{x,y\in V}\mathbb{E}_{x}(\tau_{y})$ is the worst-case hitting time of simple random walk on $G$ .

For Flexible Random-Compute we have the following theorem:

Theorem 3 (Flexible model computation time).

Given a network $G=(V,E)$ of $n$ nodes with a source set $V_{s}\subset V$ of $|V_{s}|=K$ source nodes each receiving independent Bernoulli arrivals with stable rate $\beta$ and a function $f_{K}$ with binary tree schema $\mathcal{T},$ for $\beta\leq 1/2$ , the average function computation time of $f_{K}$ using Flexible Random-Compute is

[TABLE]

where $\hat{\alpha}>1$ is a constant, $h$ is the height of the binary tree schema $\mathcal{T}$ , $c(\beta)\in[0,1]$ is a continuous and increasing function of $\beta$ with $c(0)=0$ and $c(\beta)\rightarrow 1$ as $\beta\rightarrow\beta^{*}$ where $\beta^{*}$ is the critical rate below which data rates are stable and above which they are unstable. Here, $t_{\mbox{\scriptsize mix}}^{G}$ is the mixing time of simple random walk on $G$ , $\nu=\sum_{v\in V}{(\mbox{deg}(v)^{2})}/{d^{2}n}$ where, $\mbox{deg}(v)$ is the degree of any vertex $v\in V$ and $d=2m/n$ is the average vertex degree.

Computation time results: Discussion and examples

Before comparing the two models, let us look at an important parameter $\nu$ in the flexible model latency result which represents the variability in the vertex degree. Recall $\nu=\sum_{v\in V}{(\mbox{deg}(v)^{2})}/{d^{2}n}$ where, $d$ is the average vertex degree. For regular graphs like cycle $\nu=1$ whereas for skewed graphs like star $\nu=O(n)$ . This parameter affects the flexible model latency as we can see the star graph has a much lower latency then the cycle graph (see Table 2). This is because in the skewed graphs like star higher degree nodes allow more coalescences which are critical for our analysis, whereas it is not so in the regular graphs as all nodes are similar.

In Table 2, we compare the computation time of Fixed and Flexible Random-Compute for various graphs given that $\beta<\beta^{*}$ i.e. $c(\beta)<1$ where $c(\beta)\in[0,1]$ is a continuous and increasing function of $\beta$ . The given condition ensures that the queues at the nodes are finite and function computation is carried in stable manner.

Consider Rand $r$ - reg graph, in this case the performance of both the algorithms is same whereas in the cycle graph Fixed Random-Compute is better than the Flexible Random-Compute but in the star graph latter dominates the former. So, the upper bounds in the table do not clarify whether Flexible Random-Compute is better from a computation time point of view or Fixed Random-Compute. Let us now analyse two models based on their working. Under the fixed schema, two operands that can be combined may meet at some node but may not be combined since the node at which they meet is not the designated node where they are to be combined. This is a major disadvantage that the flexible schema does not have: in the flexible schema the first time two combinable operands meet they are combined. However, the fixed model counterbalances this disadvantage with a significant advantage: since any node designated to combine two operands knows which specific operands it is to combine, it can recognise and store the first of the operands separately and wait for the second operand to arrive. The flexible model cannot do this and must perform opportunistic combinations. It is an open problem to try and establish if the computation time of Fixed Random-Compute is stochastically dominated by the computation time of Flexible Random-Compute or vice-versa.

In next section, we will discuss the proofs of theorems discussed in this section.

5 Proofs

In this section, we present the detailed proofs of all the theorems. First, we prove the bounds on the stable rate of computation for the fixed model. Then, we give an upper bound on the average function computation time of the fixed and flexible model in terms of the basic random walk primitives: the hitting time and the mixing time respectively.

5.1 Rate Analysis

Szpankowski defines the stable data rate as the rate which ensures that the queues at all nodes in $V$ are bounded (see Section 4). Now, we discuss the proof of the theorem giving bounds for such stable rate of computation for Fixed Random-Compute algorithm. We prove this theorem using an existing rate result of [14] and [4].

Proof of Theorem 1.

First, we will prove a lower bound on the stable rate of the Fixed Random-Compute . For this we will use the lower bound on the rate of data collection by an algorithm similar to ours. In an earlier work [14] authors propose an algorithm (called Random-Collect) collecting data packets from source nodes to a designated node called sink. In their setting, there is no combination of data packets and all the source data packets travel till they reach the sink. Random-Collect algorithm is a generalised algorithm where data packets perform random walk on the network and the specific type of random walk would depend on the transition probability of moving from one node to another like the transition probability given by Eq. 1 allows packets to move in a simple random walk fashion. We will use the following simple random walk specific result from [14] which proves a lower bound on the stable data rate in that paper.

Lemma 1 (Network throughput lower bound [14]).

For a given graph $G=(V,E)$ with $|V|=n$ nodes, source set $V_{s}\subseteq V\setminus\{u_{\mbox{\scriptsize s}}\}$ with $|V_{s}|=s$ data sources, each generating data as independent Bernoulli arrivals with stable data arrival rate $\beta$ , and a single sink, $u_{\mbox{\scriptsize s}}$ , we have that

[TABLE]

where $\lambda_{2}$ is the second largest eigenvalue of transition matrix $\mathcal{P}$ of simple random walk on graph $G$ , $d_{min}$ and $d_{max}$ are the minimum and maximum degree of nodes of the graph $G$ respectively. These results hold for $\beta<\beta^{*}$ , where $\beta^{*}$ is the critical rate below which data rates are stable and above which they are unstable.

Recall that the schema to compute the function $f_{K}$ on $K$ sources is given by a binary tree $\mathcal{T}$ to Fixed Random-Compute algorithm. Thus, there are $K-1$ coalescences of data packets taking place while computing the function. Recall that in Fixed Random-Compute algorithm any coalescence, say $\hat{a}$ of data packets $x_{i},x_{j},$ can happen only at a particular node $u$ in the network and this is given by a mapping $\phi$ from $\mathcal{T}$ to the vertex set $V$ of the network $G.$ In this case, the node $u$ acts like a source to the data packets generated by the coalescence $\hat{a}$ which is used for the next level coalescence. We restrict the coalescence $\hat{a}$ which happens at node $u.$ This reduces the problem to the collection $x_{i},x_{j}$ by node $u.$ By Eq. 2, the rate of collection of $x_{i},x_{j}$ by $u$ using Fixed Random-Compute is $\beta_{ij}\geq\frac{1-\lambda_{2}}{2\sqrt{3}}\left(\frac{d_{min}}{d_{max}}\right)^{1/2}.$ Note that while achieving the rate $\beta_{ij}$ , Fixed Random-Compute might be using some edge of $G$ at its full capacity. There are $(K-1)$ such coalescences happening in $G$ at the same time thus the rate achieved by each of them without exceeding any edge capacity is $\beta_{ij}\geq\frac{1-\lambda_{2}}{2\sqrt{3}(K-1)}\left(\frac{d_{min}}{d_{max}}\right)^{1/2}.$ Thus, the stable rate of Fixed Random-Compute is lower bounded by $\beta\geq\frac{1-\lambda_{2}}{2\sqrt{3}(K-1)}\left(\frac{d_{min}}{d_{max}}\right)^{1/2}.$

We now turn to the upper bound. Authors in [4] study the rate of function computation when the function $f_{K}$ is a divisible function. A function $f_{K}$ is said to be divisible if for any partition of its source data set, $f_{K}$ can be computed by computing a local operation on any set and aggregating the result [4]. In other words, they study functions whose source data packets can be combined in any order. In our case, source data packets of $f_{K}$ can be combined only by the order given by schema $\mathcal{T}.$ Thus, the rate achieved in [4] is an upper bound on the rate achieved by our algorithm. Authors in [4] show that maximum stable rate for a divisible function is the min-mincut of the network graph $G.$ We define the equivalence of min-mincut for the transition matrix of our algorithm (transition probability given by Eq. 1) for $G$ as follows: recall that $u_{\mbox{\scriptsize s}}$ is the sink node in $G$ where the final computed function is collected and let $i$ be any node in the network. Then, the mincut between node $i$ and sink $u_{\mbox{\scriptsize s}}$ is defined as:

[TABLE]

Further, the min-mincut of $G$ is defined as $\delta:=\min_{i\in V}\delta_{i}.$ Following the same proof technique as that of Theorem 1 of [4] we get that the maximum stable rate $\beta$ achieved by Fixed Random-Compute algorithm is upper bounded by $\delta.$

∎

5.2 Computation Time Analysis

Next, we discuss the latency in function computation or the average time taken to compute the function under the two given models: fixed and flexible. We start with the proof of theorem dealing with the analysis of the computation time for the fixed model and then, present the proof of the flexible model computation time theorem. In both proofs, we first find the expected maximum time taken by $\ell$ data packets to appear at each source node and then, we find the expected time taken to compute the function for such $\ell$ rounds of data generated. Since, all data rounds are independent we find this time by analysing the single round computation time i.e. the time taken to compute the function value for the $i^{th}$ data round comprising of $|V_{s}|=K$ data packets with round identifier $i$ . For the fixed model, we find the single round computation time by recursively reducing our problem to the hitting event of data packets from two nodes onto a designated third node and then we present an upper bound on the time taken for one such event using the techniques and queueing delay analysis of [14]. For the flexible model, single round computation time proof involves a major modification of the proof of Cooper et. al.’s main theorem [9] and also uses techniques from [14] to handle queueing delay.

Proof of Theorem 2.

Given a graph $G$ where each node in source set $V_{s}$ receives independent Bernoulli arrivals with stable rate $\beta$ . Each data round generated has a total of $|V_{s}|=K$ data packets and each round has its appearance and computation time in the network. Instead of individually finding such times for each round, we will proceed our analysis by finding the expected maximum time by which $\ell$ rounds of data arrival have happened and then after this time, we find the expected function computation time assuming all $K\ell$ packets have appeared.

Let $\tau_{\mbox{\scriptsize app}}^{\ell}$ be the appearance time of $\ell$ rounds of data arrival on each node using Fixed Random-Compute algorithm for communication. Let $f$ be a hypothetical node where we assume packets reside before arriving at the network nodes, then, $\forall u\in V_{s}$ and $1\leq j\leq\ell$ , we have $X_{u,j}^{0}=f$ , where $X_{u,j}^{t}$ is the random variable denoting the position of $j^{th}$ round data packet of node $u$ at the start of time slot $t$ . If the $j^{th}$ data item appears at node $u$ at time $t^{\prime}$ then $X_{u,j}^{t}=f$ for all $t<t^{\prime}$ and $X_{u,j}^{t^{\prime}}=u$ . With this notation we can define appearance time as: $\tau_{\mbox{\scriptsize app}}^{\ell}=\min_{t}\{X_{u,j}^{t}\in V:1\leq j\leq\ell,u\in V_{s}\}$ . So, $\tau_{\mbox{\scriptsize app}}^{\ell}$ is the earliest time when $\ell$ packets have appeared at each node. Now, we find the expected appearance time of $\ell$ rounds.

Let $A$ be the event that a source node $u\in V_{s}$ did not receive $\ell$ arrivals in time $t_{u}$ . Consider, $t_{u}\geq 2\ell$ and $\beta\leq 1/2$ . So, we have

[TABLE]

As, $t_{u}\geq 2\ell$ , $\binom{t_{u}}{\ell-i}\leq\binom{t_{u}}{\ell-(i-1)}\forall i:1\leq i-1\leq\ell$ also, as $\beta\leq 1/2$ , so Eq. (3) can be written as, $\mathbb{P}\left[A\right]\leq\ell\binom{t_{u}}{\ell}(1-\beta)^{t_{u}}$ . Now, let $t_{u}=w\ell$ for $w>1$ , so using the fact that $\binom{t_{u}}{\ell}\leq(\frac{et_{u}}{\ell})^{\ell}$ previous equation can be rewritten as, $\mathbb{P}\left[A\right]\leq\ell(ew(1-\beta)^{w})^{\ell}$ . So, we can easily find $w$ such that $ew(1-\beta)^{w}\leq 1/e$ . Note that the value of $w$ will be only $\beta$ dependent as $w=\frac{b}{\beta}$ for some constant $b>1$ . So, for a node $u\in V_{s}$ , $\mathbb{P}\left[t_{u}\geq w\ell\right]\leq\ell e^{-\ell}$ . Since, all random walks of data packets from all $K$ nodes are independent of each other, considering worst-case analysis of all nodes we have,

[TABLE]

Now, we need to find the expected value of the maximum appearance time of packets at $K$ nodes let it be $\hat{t}_{u}=\{t_{u}:\exists u\leavevmode\nobreak\ t_{u}\geq w\ell\}$ . We have,

[TABLE]

So, we get

[TABLE]

So, we have the expected maximum appearance time of data packets at $K$ source modes as at most $w\log eK$ where $w=\frac{b}{\beta}$ for some constant $b>1$ . Now considering the worse-case analysis, the maximum time it takes for complete $l$ rounds of data packets to appear at all $K$ source nodes is,

[TABLE]

Note that after $\tau_{\mbox{\scriptsize app}}^{\ell}$ time all nodes have $\ell$ arrivals. Let $\tau_{\mbox{\scriptsize comp}}(\ell)$ be the expected function computation time of $\ell$ data rounds given the data packets have arrived at all source nodes. Now, we know because of the appearance time of rounds, random walks of data packets across the rounds are independent, so we can write $\tau_{\mbox{\scriptsize comp}}(\ell)\leq\ell\tau_{\mbox{\scriptsize comp}}(1)$ .

Now, to find the single round function computation time $\tau_{\mbox{\scriptsize comp}}(1)$ we perform the following analysis. Recall that the computation schema for function $f_{K}$ is given by a binary tree $\mathcal{T}$ with height $h$ where $\log K\leq h<K$ . If $\mathcal{T}_{i}$ is a complete subtree of binary tree $\mathcal{T}$ with its root at level $i$ , then let $\tau_{\mathcal{T}_{i}}$ be the coalescence time of $\mathcal{T}_{i}$ . We define the coalescence time with respect to three vertices $x,y,z\in V$ as $\tau_{(x,y;z)}=\max\{\tau_{(x,z)},\tau_{(y,z)}\}$ where $\tau_{(a,b)}$ is the time taken for a random walk starting from $a$ to first hit $b$ , so for $\mathcal{T}_{2}$ we have $\tau_{\mathcal{T}_{2}}=\tau_{(\phi(1,0),\phi(1,1);\phi(0,0))}$ .

In general, considering the worse case scenario we have

$\tau_{\mathcal{T}_{i}}=\tau_{(\phi(1,0),\phi(1,1);\phi(0,0))}+\max\{\tau_{\mathcal{T}_{i-1,0}},\tau_{\mathcal{T}_{i-1,1}}\},$ where $\mathcal{T}_{i-1,0}$ and $\mathcal{T}_{i-1,1}$ are the two subtrees of the root with height $i-1$ . Applying the recursion again we get $\tau_{\mathcal{T}_{i-1,0}}=\tau_{(\phi(2,0),\phi(2,1);\phi(1,0))}+\max\{\tau_{\mathcal{T}_{i-2,00}},\tau_{\mathcal{T}_{i-2,01}}\},$ and $\tau_{\mathcal{T}_{i-1,1}}=\tau_{(\phi(2,2),\phi(2,3);\phi(1,1))}+\max\{\tau_{\mathcal{T}_{i-2,10}},\tau_{\mathcal{T}_{i-2,11}}\}.$ Observing that for any four random variables $W,X,Y,Z$ , $\max\{W+X,Y+Z\}\leq\max\{W,Y\}+\max\{X,Z\}$ we get

[TABLE]

Therefore we get

[TABLE]

Note if we have a skewed binary tee schema with $h=K-1$ then in the above equation many variables of form $\tau_{(\cdot,\cdot)}$ will not be defined whereas for the complete binary tree schema with $h=\log K$ all such variables are present. Now, from the definition of coalescence time we know that each coalescence consists of two hitting time events. So, for each coalescence of form $\tau_{(x,y;z)}$ let us denote two random variables $\tau_{(x,y;z)}^{1}$ and $\tau_{(x,y;z)}^{2}$ for the two hitting time events. Let set $B$ be defined as $B:=\{\tau_{(\phi(i,2j),\phi(i,2j+1);\phi(i-1,j))}^{k}:1\leq i\leq h,0\leq j\leq 2^{i-1}-1\text{ and }k=1,2\}$ . Note that there are always $2(K-1)$ random variables of form $\tau_{(x,y;z)}^{k}$ for $k=1,2$ on the right hand side of Eq. (6). Now, let the hitting time for the $i^{th}$ out of $2(K-1)$ hitting event in the Markov chain associated with the simple random walk (with no delay) be $\tau_{(x,y;z)}^{i}$ for node $x,y,z\in V$ . So, we have $\mathbb{E}[\tau_{(x,y;z)}^{i}]\leq t_{\mbox{\scriptsize hit}}$ where $t_{\mbox{\scriptsize hit}}=\max\limits_{x,y\in V}\mathbb{E}_{x}(\tau_{y})$ is the worst-case hitting time of the simple random walk starting from any node of the graph. By Markov’s inequality $\mathbb{P}\left[\tau_{(x,y;z)}^{i}\geq et_{\mbox{\scriptsize hit}}\right]\leq\frac{1}{e}$ . Now, consider the probability of the random walk not hitting $z$ in $we$ times the worst-case hitting time i.e. we consider $w\leavevmode\nobreak\ et_{\mbox{\scriptsize hit}}$ time and divide it into $w$ slots $(l=1,2,\cdot\cdot\cdot,w)$ of $et_{\mbox{\scriptsize hit}}$ each. By the Markov property of random walks, we know that the random walks in each of these slots are independent. Also, since we have used the worst case hitting time for bounding the probability of one slot using Markov’s inequality, this bound will hold true for any starting vertex at the start of any slot $l$ . So, we have $\mathbb{P}\left[\tau_{(x,y;z)}^{i}\geq w\leavevmode\nobreak\ et_{\mbox{\scriptsize hit}}\right]\leq\prod_{l=1}^{w}\frac{1}{e}\leq\frac{1}{e^{w}}$ . Now, let $\tau_{max}=\max_{\tau\in B}\tau_{(x,y;z)}^{i}$ , then $\mathbb{P}\left[\tau_{max}\geq w\leavevmode\nobreak\ et_{\mbox{\scriptsize hit}}\right]\leq 2(K-1)e^{-w}$ . Now, we have

[TABLE]

So,

[TABLE]

as both $\log eK>0$ and $t_{\mbox{\scriptsize hit}}>0$ . So without any queueing delay, the expected time taken by the data packets involved in $2(K-1)$ coalescences (hitting events) to hit their designated nodes is $t_{\mbox{\scriptsize hit}}\log eK$ . This analysis is done assuming there is only one packet in the queue of any node at any time.

Now, we analyse the delay caused due to more than one packets in the queue. Consider a packet generated at node $p\in V_{s}$ , let us call it $p$ after its generating node. Let $X_{p,1}^{t}=X_{p}^{t}$ denote the position of packet $p$ at time $t$ , we drop the subscript denoting the round number as we are dealing with one round function computation. Now consider for any node $u\in V$ ,

[TABLE]

We know,

[TABLE]

As, $\mathbb{P}\left[A\cap B\right]\leq\mathbb{P}\left[A\right]$ , we can write $\mathbb{P}\left[Q_{t}^{\beta}(u)=w\cap X_{p}^{t}=u\right]\leq\mathbb{P}\left[Q_{t}^{\beta}(u)=w\right]$ . Using this result in Eq. 8 we get

[TABLE]

Using Eq. 9 in Eq. 7, we get

[TABLE]

Now, we have

[TABLE]

Now, let $c_{t}(\beta)=\max_{u\in V}\mathbb{P}\left[Q_{t}^{\beta}(u)\geq 2\right]$ be the maximum delay probability at time $t$ over all nodes in set $V$ for given data rate $\beta$ . So, at stationarity the maximum delay probability converges to $c(\beta)$ i.e., $c(\beta)=\lim_{t\rightarrow\infty}c_{t}(\beta)$ . $c(\beta)$ is a continuous and increasing function of $\beta$ (see Claim 1 [14] for details) with $c(0)=0$ and $c(\beta)\rightarrow 1$ as $\beta\rightarrow\beta^{*}$ where $\beta^{*}$ is the critical rate below which data rates are stable and above which they are unstable. We know that any packet gets delayed at time slot $t$ , due to queue at a node $u,$ because it is not picked for transmission in that slot among all the packets in the queue. Thus, the probability of a packet not being delayed by a node in a given time slot $t$ is $1-c(\beta)$ . So, combining the queueing delay with the non-delayed hitting time events, we have the expected number of steps taken by the packets involved in $2(K-1)$ coalescences (hitting events) to hit their designated nodes as $\frac{t_{\mbox{\scriptsize hit}}\log eK}{1-c(\beta)}$ . This expected time is computed without considering any dependence among the hitting events. Now, we use this result in Eq. (6) to capture the worse case relation between different levels of binary tree schema and hence the associated hitting time events. So, we have $C(K)=\tau_{\mathcal{T}_{h}}\leq h\dfrac{t_{\mbox{\scriptsize hit}}\log eK}{1-c(\beta)}$ .

So, we have one round function computation time i.e., assuming each source node has generated only one data packet as $\tau_{\mbox{\scriptsize comp}}(1)=C(K)=h\dfrac{t_{\mbox{\scriptsize hit}}\log eK}{1-c(\beta)}$ , where $c(\beta)\in[0,1]$ is a continuous and increasing function of $\beta$ representing the maximum delay probability in the limit over all nodes in $V$ , $t_{\mbox{\scriptsize hit}}$ being the worst-case hitting time of simple random walk on $G$ . So,

[TABLE]

Recall, $\tau_{\mbox{\scriptsize f(K)}}^{\ell}$ is the expected function computation time of $\ell$ rounds of data arrival and so, we can write

[TABLE]

Now, using the results from Eq.s (5) and (12), we have

[TABLE]

So, we have,

[TABLE]

where $b>1$ is a constant and $t_{\mbox{\scriptsize hit}}$ is the worst-case hitting time of simple random walk on $G$ . ∎

For the flexible model computation time we have the following theorem proof.

Proof of Theorem 3.

Now, for the flexible model again we are given a graph $G$ with source nodes in set $V_{s}$ receiving independent Bernoulli data arrivals at stable rate $\beta$ . We will prove this theorem on same lines as that of fixed model computation time proof. We will first find the expected maximum time by which $\ell$ rounds of data arrival have happened and then after this time, we find the expected function computation time assuming all $K\ell$ packets have appeared given that $|V_{s}|=K$ .

Recall that $\tau_{\mbox{\scriptsize app}}^{\ell}=\min_{t}\{X_{u,j}^{t}\in V:1\leq j\leq\ell,u\in V_{s}\}$ is the appearance time of $\ell$ rounds of data arrival on each node using Flexible Random-Compute algorithm for communication, where $X_{u,j}^{t}$ is the random variable denoting the position of $j^{th}$ round data packet of node $u$ at the start of time slot $t$ . Using the same analysis as in the proof of Theorem 2, we get the expected time for complete $\ell$ data rounds to be generated at all $K$ source nodes as

[TABLE]

where $b>1$ is a constant. Note that after $\tau_{\mbox{\scriptsize app}}^{\ell}$ time all nodes have $\ell$ arrivals. Let $\tau_{\mbox{\scriptsize comp}}(\ell)$ be the expected function computation time of $\ell$ data rounds given the data packets have arrived at all source nodes. Now, we know because of the appearance time of rounds, random walks of data packets across the rounds are independent, so we can write $\tau_{\mbox{\scriptsize comp}}(\ell)\leq\ell\tau_{\mbox{\scriptsize comp}}(1)$ .

Now, we perform the following analysis to find the value of $\tau_{\mbox{\scriptsize comp}}(1)$ . Recall that the communication network is represented by graph $G=(V,E)$ with sources in set $V_{s}\subseteq V.$ The schema to compute function $f_{K}$ is a binary tree of height $h$ where $\log K\leq h<K$ thus has $h+1$ levels. We label the levels in the schema starting from the sink. Leaf level i.e. level $h$ simply acts as source of data for the function to be computed, hence nodes at this level perform identity function $\theta(i,j)=x_{j+1}$ for $0\leq j<2^{i}$ . However, in case of a non-complete binary tree source nodes may be present at other levels as well, so network nodes in $V_{s}$ always perform identity function. Nodes other than the source nodes with id $\mathcal{T}(i,j)$ at all other levels i.e. level $0\leq i<h$ compute the function $\theta(i,j)=\theta(i+1,2j)\oplus\theta(i+1,2j+1)$ , where $\oplus$ is the binary operation to be performed by node $\mathcal{T}(i,j)$ specified by function schema and $0\leq j<2^{i}$ . This subfunction computation can be seen as the coalescence of the data packets of the operands. For example, in Figure 1a, there are three levels with level $2$ providing four data operands, level $1$ performing two coalescences on the respective operands and level [math] performing one coalescence. So, because of the in-network computation paradigm each level $i+1$ acts as source for the level $i$ . Now, we will first analyse the computation time for $(h-1)^{th}$ level and replicate it for other levels. If the time required to complete $j^{th}$ level is $C_{k_{j}}$ then the total time to complete the function computation is $C(K)=\sum_{j=0}^{h-1}C_{k_{j}}=O(hC_{k_{(h-1)}}).$ For the simplicity of notations, we remove the subscript $j$ denoting the level number from various notations whenever it is clear from the context.

Multiple random walks to single random walk.

Given $k=2^{i}$ data packets from a single level, $i$ , of $\mathcal{T}$ let us consider the collection of random walks $\{X_{i,j}(t):0\leq j<k\}$ executed by these $k$ packets in $G$ . We represent this collection as a single random walk on the product graph $Q_{k}=(V_{Q_{k}},E_{Q_{k}})$ where $V_{Q}=V^{k}.$ Consider the set

[TABLE]

Note that when the random walk on $Q_{k}$ visits any vertex $S_{k}$ it is equivalent to all the $k/2$ pairs of walks on $G$ simultaneously coalescing which is what we require for computing the collection of functions $\{\theta(i,2l)\oplus\theta(i,2l+1):0\leq l\leq 2^{i-1}-1\}$ .

In other words, each node $\textbf{v}\in V_{Q_{k}}$ is a $k$ -tuple of nodes $\{v_{1},\ldots,v_{k}\}$ where each $v_{i}\in V.$ Thus, $k$ random walks $X_{u_{i}}(t)$ on graph $G$ starting from $u_{i}\in V_{s}$ can be replaced by a single random walk $X_{\textbf{u}}(t)$ on graph $Q_{k}$ with starting position $\textbf{u}=(u_{1},u_{2},\ldots,u_{k})$ . We further reduce the graph $Q_{k}$ to a graph $\Gamma_{k}$ by contracting the set $S_{k}$ to a single vertex $\gamma_{k}$ while retaining all other vertices, edges and loops. Thus, the degree of every vertex of $\Gamma_{k}$ is same as that of $Q_{k}$ except for the vertex $\gamma_{k}.$ The degree of $\gamma_{k}$ is the sum of degrees of all vertices of set $S_{k}$ and is given by $d_{\gamma_{k}}.$ Also if $\pi$ and $\hat{\pi}$ are the stationary distributions of a random walk on graphs $Q_{k}$ and $\Gamma_{k}$ respectively, then $\hat{\pi}_{v}=\pi_{v}$ for $v\notin S_{k}$ and $\hat{\pi}_{\gamma_{k}}=\pi_{S_{k}}=\sum_{x\in S_{k}}\pi_{x}$ .

Hitting time from stationarity.

Recall that the probability of a random walk to move from vertex $u$ to $v$ in graph $G$ is given by $\mathcal{P}[u,v]$ (defined by Eq. 1) in our algorithm. Let $\pi$ be the stationary distribution of our random walk in $G$ . Let $\mathcal{P}^{t}[u,\cdot]$ be the probability distribution of the Markov chain associated with the given random walk that begins at state $u$ at time $t$ . Then, the distance of this distribution from its stationary distribution $\pi$ at time $t$ is defined as $d(t):=\max_{u\in V}\lvert|\mathcal{P}^{t}[u,\cdot]-\pi|\rvert_{TV}$ . Then, for any $\epsilon>0$ mixing time $t_{\mbox{\scriptsize mix}}^{G}$ of the random walk on $G$ with transition probability given by Eq. 1, is given by

[TABLE]

This measures the time required by the Markov chain for the distance to the stationarity to be small. The mixing time for graphs $Q_{k}$ and $\Gamma_{k}$ are defined similarly and are denoted by $t_{\mbox{\scriptsize mix}}^{Q_{k}}$ and $t_{\mbox{\scriptsize mix}}^{\Gamma_{k}}$ respectively. We use the following relation among the mixing times of the three graphs in our proof.

Lemma 2.

*([9], Lemma 2)

Mixing time of simple random walk with transition probability given by Eq. 1 on graphs $G,Q_{k},\Gamma_{k}$ are given as*

[TABLE]

such that

[TABLE]

where $F$ can be any of the graphs $G,Q,\Gamma,$ $n_{F}=\lvert V_{F}\rvert$ and $1-\lambda_{2}$ is the spectral gap of the transition matrix of the simple random walk.

Proof.

The bound on mixing time of $G$ directly follows from Theorem 12.3 of [25]. Relation between the mixing times for $Q_{k}$ and $\Gamma_{k}$ with that of $G$ follow directly from the proof of Lemma 2 of [9]. ∎

Let $\mathbb{E}_{\pi}(\tau_{v})$ be the expected hitting time of a vertex $v$ starting from the stationary distribution $\pi.$ From Proposition 10.19 of [25], we can write

[TABLE]

where,

[TABLE]

Let $M_{k}(\textbf{u})$ be the time until the first two random walks meet in $Q_{k}$ when they start from $\textbf{u}.$ In other words, $M_{k}(\textbf{u})$ is the time to reach the vertex $\gamma_{k}$ in graph $\Gamma.$ Now, if mixing time $t_{\mbox{\scriptsize mix}}^{\Gamma}$ of graph $\Gamma$ satisfies Eq. 18, then

[TABLE]

where $\mathbb{E}_{\pi}(\tau_{\gamma_{k}})$ is the hitting time of vertex $\gamma_{k}$ from stationary distribution $\pi$ in graph $\Gamma$ . The following lemma follows directly from Lemma 3 of [9].

Lemma 3.

Let the spectral gap of the transition matrix $\mathcal{P}[\cdot,\cdot]$ of simple random walk on graph $F$ be $1-\lambda_{2}$ , then for any vertex $v$ of graph $F,$ we have

[TABLE]

where $Z_{vv}$ is defined in Eq. 20. This lemma holds for all three graphs $G,Q_{k}$ and $\Gamma_{k}$ .

Now we prove the following Lemma which will be used to prove the time for single level coalescence.

Lemma 4.

Let $G$ be a connected graph with $n$ vertices and $m$ edges and let

[TABLE]

where $\nu=\sum_{v\in V}{(\mbox{deg}(v))^{2}}/{d^{2}n}$ . Let $k$ be an integer, $2\leq k\leq l^{*}$ and $\gamma_{k}$ be the vertex in graph $\Gamma$ representing the contracted set $S_{k}.$ Then, there exists a constant $c_{k}>0$ such that,

[TABLE]

Proof.

Recall the definition of the set $S_{k}$ which is as follows:

[TABLE]

If $k=2,$ then, the degree of set $S_{k}$ is:

[TABLE]

If $3\leq k\leq l^{*}$ and for any $x,y$ such that $0\leq x<y<k$ , we define subsets of $S_{k}$ as:

[TABLE]

We get, $S_{k}=\underset{0\leq x<y<k}{\cup}S_{(x,y)}$ , where $\mbox{deg}(S_{(x,y)})=(2m)^{k-2}\sum_{v\in V}(\mbox{deg}(v))^{2}=(2m)^{k}\frac{\nu}{n}$ .

By the definition of our function $f_{K},$ for $\{x,y\}\neq\{p,q\}$ , $\{x,y\}\cap\{p,q\}=\phi$ . So, $\mbox{deg}(S_{(x,y)})\cap S_{(p,q)})=(2m)^{k-4}\sum_{u,v\in V}(\mbox{deg}(u))^{2}(\mbox{deg}(v))^{2}$ . Thus by inclusion-exclusion principle,

[TABLE]

The factor of $k/2$ in Eq. 23 is the result of fixed combinations among source nodes because of restricted coalescence defined by function and the combinatorial factor in other term is because of disjoint nature of nodes i.e. $\{x,y\}\cap\{p,q\}=\phi$ . The bound in Eq. 25 follows from Eq. 24, by using upper bound on $k$ from Eq. 22. Then, using bound on $\mbox{deg}(\gamma_{k})$ from Eq. 26 for $\pi_{\gamma_{k}}$ , we get the desired result. ∎

Recall that $\mathbb{E}_{\pi}(\tau_{\gamma_{k}})$ is the expected time to hit the vertex $\gamma_{k}$ from stationary distribution $\pi.$ Then by Lemma 3 and Lemma 4 we get,

[TABLE]

Computation time without queueing delay for the simple random walk.

Recall that $M_{k}(\textbf{u})$ is the time of coalescences of $k\leq l^{*}$ random walks in $G$ i.e, the time to hit $\gamma_{k}$ in graph $\Gamma$ which is given by Eq. 21. By Lemma 2 and Eq. 27 we get,

[TABLE]

Eq. 29 gives the bound for expected first coalescence time among $k\leq l^{*}$ data packets, indicating partial computation of $(h-1)^{th}$ level of function schema. Now, for complete computation of $(h-1)^{th}$ level, we will prove that with probability $1-1/n^{c}$ , where $c>0$ is some constant there cannot be a subset of $k=l^{*}$ data packets which did not coalesce by time $t^{*},$ where $t^{*}=l^{*}\log{K}(t_{\mbox{\scriptsize mix}}^{\Gamma}+3E_{\pi}(\tau_{\gamma_{k}})).$ To prove this we need the following result.

Lemma 5.

*([9], Lemma 1)

The probability of the event $N^{t}(u,v)$ such that a random walk starting from $u$ does not visit vertex $v$ in first $t$ time steps is given by*

[TABLE]

Here $T=t_{\mbox{\scriptsize mix}}^{G}$ is the mixing time of the simple random walk on $G$ and $\mathbb{E}_{\pi}(\tau_{v})$ is the expected hitting time of vertex $v$ from stationary distribution $\pi$ .

Let $\mathcal{N}(k,\textbf{v})$ be the set of data packets starting from vertex $\textbf{v}=(v_{1},\cdots,v_{k}).$ There are two cases for coalescences of these packets. Either these data packets have coalesced during mixing time $t_{\mbox{\scriptsize mix}}^{\Gamma}$ , or they have not. For latter case, we can use Lemma 5 on graph $\Gamma_{k}$ with vertex $\gamma_{k}$ and $t=t^{*}$ . This gives us the probability that the data packets have not coalesced by time $t$ which is same as the probability of random walk $\hat{X}_{\textbf{v}}$ on graph $\Gamma_{k}$ not hitting vertex $\gamma_{k}$ by time $t.$ So, using Lemma 5, $\mathbb{P}(\mbox{No meeting among }\mathcal{N}(k,\textbf{v})\mbox{ before }t^{*})$ is:

[TABLE]

So,

[TABLE]

The bound of earlier equation is achieved from value of $l^{*}$ in Eq. 22 and upper bound on binomial coefficient. So, expected number of steps until fewer than $l^{*}$ data packets remain is at most $t^{*}+\frac{1}{2}(2t^{*})+\frac{1}{4}(3t^{*})+...=4t^{*}$ . This is because after every $t^{*}$ step, number of data packets reduces by half due to coalescing. Now combining the above mentioned result with Eq. 29, we can get the coalescence time for $(h-1)^{th}$ level as,

[TABLE]

Now, for $l^{*}=\log{K}$

[TABLE]

Bound of Eq. 30 holds even if $l^{*}<\log{K}$ . Note that this analysis holds true for coalesced packets as well, as after coalescing they are treated in similar way as data packets which originated from source set $V_{s}$ . So, repeating similar analysis for all $h$ levels we get total function computation or coalescence time without any queueing delay as:

[TABLE]

By Lemma 2, the total coalescence time in terms of the mixing time is given by:

[TABLE]

Eq. 31 and Eq. 32 give the one round function computation time assuming only one packet inside each queue. Now, we will find the actual function computation time including queueing delays.

Incorporating queueing delay.

We will use the same queueing analysis as used in the proof of Theorem 2 to get a bound on the probability of a packet being delayed by a node at time $t$ . So, we have $\mathbb{P}\left[\mbox{Packet }p\mbox{ is delayed at }u\mbox{ at time }t\right]\leq\sum_{w\geq 2}\mathbb{P}\left[Q_{t}^{\beta}(u)=w\right]=\mathbb{P}\left[Q_{t}^{\beta}(u)\geq 2\right]$ and we know,

[TABLE]

So, let $c_{t}(\beta)=\max_{u\in V}\mathbb{P}\left[Q_{t}^{\beta}(u)\geq 2\right]$ be the maximum delay probability at time $t$ over all nodes in set $V$ for given data rate $\beta$ . Then, at stationarity we can say that the maximum delay probability converges to $c(\beta)$ where $c(\beta)=\lim_{t\rightarrow\infty}c_{t}(\beta)$ and is a continuous and increasing function of $\beta$ (see Claim 1 [14] for details). Thus, the probability of a packet not being delayed by a node in a given time slot $t$ is $1-c(\beta)$ . So, combining the queueing delay with the non-delayed computation time, we have the expected number of steps by which the function is computed as

[TABLE]

So, we have one round function computation time i.e., assuming each source node has generated only one data packet as $\tau_{\mbox{\scriptsize comp}}(1)=C(K)=O\Bigg{(}\dfrac{ht_{\mbox{\scriptsize mix}}^{G}}{1-c(\beta)}\bigg{(}\log^{3}{K}+\dfrac{n}{\nu}\frac{\log{K}}{\log n}\bigg{)}\Bigg{)}$ , where $c(\beta)\in[0,1]$ is a continuous and increasing function of $\beta$ and $t_{\mbox{\scriptsize hit}}$ is the worst-case hitting time of simple random walk on $G$ . So,

[TABLE]

where $D>1$ is a constant. Recall, $\tau_{\mbox{\scriptsize f(K)}}^{\ell}$ is the expected function computation time of $\ell$ rounds of data arrival and so, we can write

[TABLE]

Now, using the results from Eq.s (16) and (33), we have

[TABLE]

So, we have,

[TABLE]

where $b,D>1$ are constants and $t_{\mbox{\scriptsize hit}}$ is the worst-case hitting time of simple random walk on $G$ . ∎

6 Conclusion and Future Work

In this paper, we have tried to demonstrate how random walk-based methods can be used for the in-network computation of a very general class of functions: asymmetric functions whose schema is described by a binary tree. We present lower and upper bounds on the rate for the fixed scenario. Our lower bound on rate though computed for our fixed setting is a general lower bound on rate of function computation. To the best of our knowledge, this is the first lower bound on rate for this class of problem. We also present the average function computation time under Bernoulli data generation model for both fixed and flexible model. However, our results hold for other data generation models as well, we will discuss our results in context of two different data generation models, some of the questions our setting and our results raise and also possible future directions of this work in the remaining part of this section.

Other data generation models

First we consider a realistic data generation model which is semi-deterministic in nature. In this model, given a $\beta>0$ each node $u\in V_{s}$ generates packet with sequence number $i$ ( $i>0$ ) at time $\frac{i}{\beta}+N_{v}(0,\gamma),$ where $N_{v}(0,\gamma)$ is a normal random variable with mean 0 and variance $\gamma$ and $\gamma>0$ is called the clock drift parameter. We assume that $\{N_{v}(0,\gamma):v\in V\}$ is an independent collection of random variables. So, for this model the maximum appearance time of $\ell$ rounds is $\tau_{app}^{\ell}\leq\frac{\ell}{\beta}+\max_{u\in V_{s}}N_{v}(0,\gamma)$ . So, using this in Eq. (13) and Eq. (34) we get the average function computation time under the given data generation model for the Fixed Random-Compute as $\dfrac{1}{\beta}+h\dfrac{t_{\mbox{\scriptsize hit}}\log eK}{1-c(\beta)}$ and for the Flexible Random-Compute as $\dfrac{1}{\beta}+\dfrac{ht_{\mbox{\scriptsize mix}}^{G}}{1-c(\beta)}\bigg{(}\log^{3}{K}+\dfrac{n}{\nu}\dfrac{\log{K}}{\log n}\bigg{)}$ where variables denote the usual quantities as discussed in earlier sections.

Now, consider the other data generation model which is continuous wherein each source node has data arrivals as Poisson process. This data generation model results in a Markov process defined by the queues of the nodes. Since, every Markov process has an embedded Markov chain and we already know that the Markov chain on the queues (as seen under the Bernoulli data generation model) achieves stationarity, so the results for the Bernoulli model also hold true for this model.

Discussion: How general is the binary tree function schema?

Note that any function of data consists of unary, binary or $M$ -ary operations. In the function computation schema, any intermediate node representing unary operation can be merged with its parent node, i.e., the unary operation can be performed by the network vertex which performs the operation for its parent node. Thus function computation schema with only $M$ -ary operations is general for $M\geq 2$ . For an $M>2$ there are two possibilities. One is to create a binary tree for an $M$ -ary operation by dividing the $M$ -ary function into a series of binary operations. Then the time required to complete the $M$ -ary function is equal to completing the operations of the equivalent binary tree which can be done using the techniques of Fixed and Flexible model analysis (see Section 5.2). Another way is to look at the $M$ -ary function as a symmetric function of $M$ data sources. In this case, time to compute the function can be computed using the techniques available in the literature for symmetric functions; see [28].

Future work: Comparing the Fixed scenario and the Flexible scenario

As discussed in Sections 4, currently we are not able to determine which of the two scenarios provides more efficient function computation. Proving that Flexible Random-Compute always does better than Fixed Random-Compute is one direction that is worth pursuing since otherwise, we are not able to justify the extra information Flexible Random-Compute needs to store (each node must know the entire function schema). However, on the grounds of fault-tolerance Flexible Random-Compute justifies the extra storage. It is more robust than Fixed Random-Compute since a single failure can disable the entire computation in the latter case if the failure occurs at a node which is tasked with computing a subfunction. In Flexible Random-Compute on the other hand, as long as the sources are connected to each other the computation can always take place since any node can perform any subfunction. Characterising the computation time performance of Flexible Random-Compute under suitable faults is an interesting direction to extend this work since most real-world sensor nodes tend to be failure prone.

Future work: A rate result

In terms of analysis, the decomposition of Fixed Random-Compute into a set of instances of data collection using random walks problem allows us to leverage the ideas developed in [14] to characterise the rate of computation under an independent Bernoulli data generation model. However, characterising the rate of Flexible Random-Compute is not as straightforward and presents an interesting challenge that we look forward to addressing in the future.

Bibliography34

The reference list from the paper itself. Each links out to its DOI / PubMed record.

1[1] D. Aldous and J. Fill. Reversible markov chains and random walks on graphs, 2002. (Monograph in preparation).
2[2] N. Alon, C. Avin, M. Kouckỳ, G. Kozma, Z. Lotker, and M. R. Tuttle. Many random walks are faster than one. Comb. Probab. Comput. , 20(4):481–502, 2011.
3[3] C. Avin and C. Brito. Efficient and robust query processing in dynamic environments using random walk techniques. In Proc. of the 3rd Intl. Symposium on Information Processing in Sensor Networks , IPSN ’04, pages 277–286. ACM, 2004.
4[4] S. Banerjee, P. Gupta, and S. Shakkottai. Towards a queueing-based framework for in-network function computation. Queueing Syst. , 72(3):219–250, 2012.
5[5] R. Beraldi, R. Baldoni, and R. Prakash. A biased random walk routing protocol for wireless sensor networks: The lukewarm potato protocol. IEEE Trans. Mobile Comput. , 9(10):1649 – 1661, 2010.
6[6] S. Boyd, A. Ghosh, B. Prabhakar, and D. Shah. Gossip algorithms: Design, analysis and applications. In Proc. of the 24th Annual Joint Conf. of the IEEE Computer and Comm. Societies , INFOCOM ’05, pages 1653–1664 vol. 3. IEEE, 2005.
7[7] K. Censor-Hillel and H. Sachnai. Fast information spreading in graphs with large weak conductance. SIAM J. Comput. , 41(6):1451–1465, 2012.
8[8] C. K. Chau and P. Basu. Exact analysis of latency of stateless opportunistic forwarding. In Proc. of the 28th Annual Joint Conf. of the IEEE Computer and Comm. Societies , INFOCOM ’09, pages 828–836. IEEE, 2009.

TL;DR

Contribution

Findings

Abstract

Peer Reviews

Videos

Random Walk-based In-network Computation of Arbitrary Functions

Abstract

1 Introduction

Our contributions

Related work

Paper organization

2 Modelling Assumptions and Computation Scenarios

2.1 Network and Computation Model

The network model

The model of computation in the network

The data generation model

2.2 The Binary Tree Function Schema

2.3 In-network Computation Scenarios

In-network computation scenario 1: The Fixed model

In-network computation scenario 2: The Flexible model

3 Our Algorithms

3.1 Routing Scheme Used by Our Algorithms

3.2 Fixed Random-Compute Algorithm

3.3 Flexible Random-Compute Algorithm

4 Performance Metrics and Our Results

Definition 1** (Stable rate of computation).**

Theorem 1**.**

Rate results: Discussion and examples

Definition 2** (Average function computation time).**

Theorem 2** (Fixed model computation time).**

Theorem 3** (Flexible model computation time).**

Computation time results: Discussion and examples

5 Proofs

5.1 Rate Analysis

Proof of Theorem 1.

Lemma 1** (Network throughput lower bound [14]).**

5.2 Computation Time Analysis

Proof of Theorem 2.

Proof of Theorem 3.

Multiple random walks to single random walk.

Hitting time from stationarity.

Lemma 2**.**

Proof.

Lemma 3**.**

Lemma 4**.**

Proof.

Computation time without queueing delay for the simple random walk.

Lemma 5**.**

Incorporating queueing delay.

6 Conclusion and Future Work

Other data generation models

Discussion: How general is the binary tree function schema?

Future work: Comparing the Fixed scenario and the Flexible scenario

Future work: A rate result

Definition 1 (Stable rate of computation).

Theorem 1.

Definition 2 (Average function computation time).

Theorem 2 (Fixed model computation time).

Theorem 3 (Flexible model computation time).

Lemma 1 (Network throughput lower bound [14]).

Lemma 2.

Lemma 3.

Lemma 4.

Lemma 5.