On the Theory of Dynamic Graph Regression Problem
Mostafa Haghir Chehreghani

TL;DR
This paper introduces a framework for efficiently updating linear regression solutions on dynamic graphs using update-efficient matrix embeddings, enabling fast recalculations after graph modifications.
Contribution
It defines update-efficient matrix embeddings and demonstrates how they enable fast updates of regression solutions in dynamic graphs, including adjacency and Laplacian matrices.
Findings
Exact regression solutions can be updated in O(nm) time after graph updates.
The approach works for both adjacency and Laplacian matrix embeddings.
Experiments show high efficiency on synthetic and real-world graphs.
Abstract
Most real-world graphs are dynamic, i.e., they change over time through a sequence of update operations. While the regression problem has been studied for static graphs and temporal graphs, it has not been investigated for general dynamic graphs. In this paper, we study regression over dynamic graphs. First, we present the notion of update-efficient matrix embedding, which defines conditions sufficient for a matrix embedding to be used effectively for dynamic graph regression (under the $\ell_2$ norm). Then, we show that given an $n \times m$ update-efficient matrix embedding (e.g., the adjacency matrix) and after an update operation in the graph, the exact optimal solution of linear regression can be updated in $O(nm)$ time for the revised graph. Moreover, we show that this also holds when the matrix embedding is the Laplacian matrix and the update operations are restricted to edge insertion/deletion. In the end, by…
| Dataset | Node insertion: update time | Node insertion: scratch time | Node deletion: update time | Node deletion: scratch time |
|---|---|---|---|---|
| wiki-vote | 193.76 | 1601.20 | 159.37 | 1451.28 |
| lastfm-asia | 38.08 | 261.41 | 31.06 | 256.83 |
| soc-sign-bitcoinotc | 23.94 | 102.99 | 21.31 | 101.14 |
On the Theory of Dynamic Graph Regression Problem
Mostafa Haghir Chehreghani
Department of Computer Engineering and
Information Technology,
Amirkabir University of Technology,
Tehran, Iran
Abstract
Most real-world graphs are dynamic, i.e., they change over time through a sequence of update operations. While the regression problem has been studied for static graphs and temporal graphs, it has not been investigated for general dynamic graphs. In this paper, we study regression over dynamic graphs. First, we present the notion of update-efficient matrix embedding, which defines conditions sufficient for a matrix embedding to be used effectively for dynamic graph regression (under the $\ell_2$ norm). Then, we show that given an $n \times m$ update-efficient matrix embedding (e.g., the adjacency matrix) and after an update operation in the graph, the exact optimal solution of linear regression can be updated in $O(nm)$ time for the revised graph. Moreover, we show that this also holds when the matrix embedding is the Laplacian matrix and the update operations are restricted to edge insertion/deletion. In the end, by conducting experiments over synthetic and real-world graphs, we show the high efficiency of updating the solution of graph regression.
Keywords
Dynamic graphs, linear regression, update-efficient matrix embeddings, update time
1 Introduction
Graphs are an important tool for modeling many complex systems such as the world wide web, social networks, road networks and citation networks. Most real-world graphs are dynamic, i.e., they change over time through a sequence of graph update operations. A graph update operation can be a node insertion, a node deletion, an edge insertion, an edge deletion or an edge weight change.
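A fundamental task in machine learning and data analysis is linear regression. In this problem, we receive $n$ data points, where for each $i \in \{1, \ldots, n\}$, the data consist of a row in a matrix $\mathbf{M} \in \mathbb{R}^{n \times m}$ and a single element in a vector $\mathbf{b} \in \mathbb{R}^{n}$. Then, the goal is to find a vector $\mathbf{x} \in \mathbb{R}^{m}$ such that $\mathbf{M}\mathbf{x}$ is the closest point to $\mathbf{b}$ in the column span of $\mathbf{M}$, under a proper distance measure, such as the $\ell_2$ norm (Euclidean distance or the least squares distance) or the $\ell_1$ norm (the least absolute deviation). More formally, we want to solve the following problem ($p \geq 1$):

$$\min_{\mathbf{x} \in \mathbb{R}^{m}} \|\mathbf{M}\mathbf{x} - \mathbf{b}\|_{p} \qquad (1)$$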
While this problem has been studied for (static) high dimensional data, static graphs [1] and temporal graphs [21], to the best of our knowledge it has not been investigated for general dynamic graphs. We refer to the regression problem over dynamic graphs as dynamic graph regression. In the current paper, we study the theory of dynamic graph regression over general graphs. The importance of this problem stems from a wide range of applications that generate dynamic graphs (hence, many data analysis algorithms have been developed for them [3, 23, 31]). Examples of these graphs include the world wide web, social networks and collaboration networks. We focus on the most common form of Equation 1, wherein $p$ is $2$, which is called least squares error (or $\ell_2$ norm) linear regression.
The challenges of regression over dynamic graphs are twofold. The minor challenge is a proper adaptation of the standard setting of the regression problem to graphs. As we will discuss later, this can be done by using a matrix embedding for the graph. The major challenge arises due to the dynamic nature of the graph. This means when the structure of the graph changes by an update operation, the already found solution must be updated, to become valid for the revised graph. This is not trivial, as updating the solution must be done in a time considerably less than the time of computing it from scratch, for the revised graph.
In the current paper, we aim to address these challenges. First, we present the notion of update-efficient matrix embedding that defines conditions sufficient for a matrix embedding to be used for the dynamic graph regression problem (under the $\ell_2$ norm). We show that some of the standard matrix embeddings, e.g., the (weighted) adjacency matrix, satisfy these conditions. Then, we show that given an $n \times m$ update-efficient matrix embedding, after an update operation in the graph, the exact optimal solution of the graph regression problem for the revised graph can be computed in $O(nm)$ time. In particular, using the (weighted) adjacency matrix as the matrix embedding of the graph, it takes $O(n^2)$ time to update the optimal solution, where $n$ is the number of nodes of the revised graph. Note that in this situation, computing the optimal solution for the revised graph from scratch will take $O(n^3)$ time. We also show that similar results hold for the Laplacian matrix, if the update operations are limited to edge insertion and edge deletion. In the end, by performing experiments over synthetic and real-world graphs, we show the high empirical efficiency of the update method.
The rest of this paper is organized as follows. In Section 2, we provide an overview on related work. In Section 3, we introduce preliminaries and definitions used in the paper. In Section 4, we introduce update-efficient matrix embeddings. In Section 5, we study dynamic graph regression under least squares error. In Section 6, we present our experimental results. Finally, the paper is concluded in Section 7.
2 Related work
In recent years, several algorithms have been proposed for different learning problems over nodes of a static graph. Kleinberg and Tardos [26] studied the theory of the node classification problem and showed its connection to Markov random fields. With the number of node labels, they presented an -approximation algorithm which has a polynomial time complexity. Kovac and Smith [1] developed a non-parametric regression model for nodes of a graph, where the distance between estimated values (vector ) and measured values (vector ) is computed using norm. Han et al. [22] proposed a representation learning method optimized for regression over nodes of a graph. Unlike these algorithms that deal with static graphs, in this paper we concentrate on efficiently updating the exact solution of least squares linear regression, over dynamic graphs.
There are a number of relevant problems in the literature. One of them is regression over temporal graphs, wherein node features (node contents) and measured values change over time, but the structure of the graph remains unchanged. This is in contrast with our studied problem which deals with changes in the structure of the graph. For example, while in dynamic graph regression we want to efficiently update the regression solution after adding a new edge to the graph, in temporal graph regression it is desired to update the solution when the values of the income feature (which is one of node features) of some nodes change. A common technique used to solve temporal graph regression is to utilize structural dependencies among measured values [34, 38, 40]. Han et al. [21] presented a different approach which is based on jointly learning embeddings for the measured values and node features.
The other relevant problem is online learning over nodes of a graph [25, 24]. Online learning is used where it is computationally infeasible to solve the learning task over the entire dataset. However in our studied dynamic graph regression problem, the task can be solved over the entire graph and the challenge is to effectively update the solution by structural changes in the graph. In one of known online graph learning algorithms, Herbster and Pontil [25] used a perceptron for online label prediction of nodes of a graph.
Another problem which has some connection to our studied problem is learning embeddings (representations) for nodes or subgraphs of a graph. In this problem, each node or subgraph is mapped to a vector in a low-dimensional vector space. In recent years, several algorithms have been proposed for it [19, 41, 32, 7], even though this task dates back several decades. For example, Parsons and Pisanski [33] presented vector embeddings for nodes of a graph such that the inner product of the vector embeddings of any two nodes $u$ and $v$ is negative iff $u$ and $v$ are connected by an edge; and it is $0$ otherwise. For an overview on learning embeddings for dynamic graphs, interested readers may refer to [30, 17, 16].
While the above mentioned methods learn with nodes or edges of a single graph, there also exist several algorithms in the literature that learn with a population of graphs [12, 37, 11, 28, 13, 5]. For example, Calissano et al. [5] studied building a regression model between a set of real values and a set of graphs.
Our studied setting is usually called dynamical setting [20], wherein the structure of a single graph changes by means of a graph update operation, and the goal is to effectively update the solution of a problem for the new graph. The list of graph update operations that change the structure of the graph, is presented in Section 3. For a survey on different machine learning and data mining algorithms in the dynamical setting, interested readers may refer to [20].
3 Preliminaries
We use lowercase letters for scalars, uppercase letters for constants and graphs, bold lowercase letters for vectors and bold uppercase letters for matrices. We denote by $n$ the number of nodes of graph $G$. A dynamic graph is a graph that changes over time by a sequence of graph update operations [20]. A graph update operation is an operation that inserts an edge or a node into the graph; or deletes an edge or a node and its incident edges from the graph; or changes the weight of an edge. We assume that when a new node is inserted, some edges are also added between the new node and the existing nodes of the graph. We also assume that the new node obtains the largest node id of the graph.

The weighted adjacency matrix of $G$, denoted with $\mathbf{A}$, is an $n \times n$ matrix where $\mathbf{A}_{ij}$ contains the weight of the edge from node $i$ to node $j$, if $i$ has an edge to $j$ (otherwise, it is $0$). Given an undirected weighted graph $G$, its weighted Laplacian matrix $\mathbf{L}$ is a square matrix of size $n \times n$ defined as $\mathbf{L} = \mathbf{D} - \mathbf{A}$, where $\mathbf{D}$ is a diagonal matrix with $\mathbf{D}_{ii} = \sum_{j} \mathbf{A}_{ij}$ and is called the (weighted) degree matrix. The Euclidean norm or $\ell_2$ norm of a vector $\mathbf{x}$ of size $n$, denoted with $\|\mathbf{x}\|_2$, is defined as $\sqrt{\sum_{i=1}^{n} \mathbf{x}_i^2}$.

Let $\mathbf{M} \in \mathbb{R}^{n \times m}$. The rank of $\mathbf{M}$, denoted here by $r$, is the dimension of its column space (or its row space). By $\mathbf{M}^{\top}$ we denote the transpose of $\mathbf{M}$, defined as an operator that switches the row and column indices of $\mathbf{M}$. The Singular Value Decomposition (SVD) of the matrix $\mathbf{M}$ is defined as $\mathbf{M} = \mathbf{U}\boldsymbol{\Sigma}\mathbf{V}^{\top}$, where $\mathbf{U}$ is an $n \times r$ matrix with orthonormal columns, $\boldsymbol{\Sigma}$ is an $r \times r$ diagonal matrix with non-negative non-increasing entries on the diagonal, and $\mathbf{V}^{\top}$ is an $r \times m$ matrix with orthonormal rows. The Moore-Penrose pseudoinverse matrix of $\mathbf{M}$, denoted by $\mathbf{M}^{\dagger}$, is the $m \times n$ matrix $\mathbf{V}\boldsymbol{\Sigma}^{\dagger}\mathbf{U}^{\top}$, where $\boldsymbol{\Sigma}^{\dagger}$ is a diagonal matrix defined as follows: $\boldsymbol{\Sigma}^{\dagger}_{ii} = 1/\boldsymbol{\Sigma}_{ii}$, if $\boldsymbol{\Sigma}_{ii} > 0$, and $0$ otherwise. It is well-known that the solution

$$\mathbf{x}^{*} = \mathbf{M}^{\dagger}\mathbf{b}$$

is an optimal solution (the closed form solution) for the linear regression problem under the $\ell_2$ norm, i.e., for the problem defined by Equation 1, with $p = 2$ [35]. The time complexity of computing this solution is cubic in terms of $n$ and $m$ [35], which makes it slow to compute from scratch over large graphs.
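As a minimal sketch of the closed form above, the following NumPy snippet computes a least squares solution through the Moore-Penrose pseudoinverse and compares it with NumPy's own least squares solver. The names `M`, `b` and `least_squares_via_pinv`, as well as the random test data, are illustrative assumptions and are not taken from the paper.

```python
import numpy as np


def least_squares_via_pinv(M, b):
    """Return x* = M^+ b, a minimiser of ||M x - b||_2."""
    return np.linalg.pinv(M) @ b


# Hypothetical data: 6 observations (rows) and 3 features (columns).
rng = np.random.default_rng(0)
M = rng.standard_normal((6, 3))
b = rng.standard_normal(6)

x_star = least_squares_via_pinv(M, b)

# Reference solution from NumPy's least squares routine.
x_ref, *_ = np.linalg.lstsq(M, b, rcond=None)

# At a least squares optimum the residual is orthogonal to the columns of M.
residual = M @ x_star - b
```

Both routes rely on a full decomposition of `M`, which is the from-scratch cost that the update procedures in the later sections try to avoid.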
4 Update-efficient matrix embeddings
By assuming that data are given in the form of a graph, we can extend the linear regression problem to the linear graph regression problem. In linear graph regression, we are given a graph $G$, with $n$ nodes, and a vector $\mathbf{b} \in \mathbb{R}^{n}$. Then, we want to solve the following optimization problem:

$$\min_{\mathbf{x}} \|G\mathbf{x} - \mathbf{b}\|_{2}$$

where $G\mathbf{x}$ must satisfy the following conditions: i) $G\mathbf{x}$ is well-defined, and ii) $G\mathbf{x}$ must produce a column vector. A straightforward and common way to define $G\mathbf{x}$ is to replace $G$ by vector embeddings of nodes of $G$ (a matrix embedding of $G$), denoted with $\mathbf{M}$. (We note that this notion of embedding is different from the notion of embedding used in graph pattern mining [8, 9, 10].) As a result, for each node in the graph, we define a row vector, and $\mathbf{x}$ is defined as a column vector. Hence, the linear graph regression problem is converted into finding the closest point to $\mathbf{b}$ in the column span of the matrix generated by the vector embeddings of nodes of $G$. In other words, we want to solve the following optimization problem:

$$\min_{\mathbf{x} \in \mathbb{R}^{m}} \|\mathbf{M}\mathbf{x} - \mathbf{b}\|_{2}$$
As an example motivating the linear graph regression problem, assume that we are given the graph of a social network, wherein each vertex is a person and the links represent friendship relations. Moreover, a score is assigned to each vertex which determines, e.g., its reputation. Now we want to find a function (with a least squares error) for the scores of the vertices, which is linear in terms of their structural properties. Therefore, we need to solve the linear graph regression problem for the given network. More precisely, first we need to compute a matrix embedding $\mathbf{M}$ of the nodes of the network, wherein each row $i$ is a representation of the structural properties of node $i$. Then we need to find a function for the scores which is linear in terms of the rows of $\mathbf{M}$.
A property seen in real-world graphs is that they are usually dynamic. This means they frequently gain or lose nodes or edges. As a result, after an update operation the solution found for the linear graph regression problem must be updated. This should be done in a time much less than the time of computing it from scratch for the updated graph. In this section, we discuss properties that matrix embeddings suitable for updating the solution of graph regression should satisfy. Before that, using an example we discuss what happens to a matrix embedding when an update operation occurs in the graph.
Depending on how we define the matrix embedding $\mathbf{M}$, deleting/inserting a node/edge may or may not result in changes in the vector embeddings of the other nodes. As an example, consider Figure 1 that shows a directed graph $G$, a node deletion operation (Figure 1(a)), and a node insertion operation (Figure 1(b)). In this example, the matrix embedding of the graph is defined as follows: for each node $v$ in $G$ we keep a row of size $n$, where the $i$-th entry is the inverse of the distance (shortest path length) from $v$ to node $i$. If there is no directed path from $v$ to $i$ or if $v = i$, the entry will be $0$. In the figure, the row corresponding to each node shows its vector embedding, and the matrix consisting of these rows is the matrix embedding. First, consider deleting a node from $G$. Since for each node we have a row and a column in the matrix embedding, we need to delete the row and column corresponding to the deleted node. However, this is not enough, as it does not always yield a valid matrix embedding for the updated graph. For example, an entry of a remaining row must also be changed to $0$ when, after deleting the node and its incident edges, there is no longer a path between the corresponding pair of nodes. The changes in the updated matrix embedding are depicted in red in Figure 1(a).
Then, as shown in Figure 1(b), consider inserting a new node into $G$. In order to update the matrix embedding with respect to this change, we need to add a column and a row corresponding to this new node. The entries of this row and this column are depicted in red in Figure 1(b). Fortunately, the new node and its incident edge do not change the distances between the existing nodes of the graph. So, rows/columns 1-3 of the matrix embedding do not need to change. It is worth highlighting that after detecting these changes in $\mathbf{M}$, the matrix $\mathbf{M}^{\dagger}$ must be updated accordingly and the updated $\mathbf{M}^{\dagger}$ must be multiplied with $\mathbf{b}$ to form the updated solution.
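The behaviour described in this example can be reproduced with a short sketch. The code below builds the inverse shortest-path-distance embedding with breadth-first search on a small hypothetical graph (a directed path on three nodes, which is an assumption and not the graph of Figure 1; nodes are 0-indexed). It then shows that removing a row and a column is not enough after a node deletion, whereas a suitable node insertion leaves the existing entries unchanged.

```python
from collections import deque

import numpy as np


def inverse_distance_embedding(n, edges):
    """Entry (v, i) is 1 / dist(v, i) along directed edges, or 0 if unreachable or v == i."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
    M = np.zeros((n, n))
    for source in range(n):
        dist = {source: 0}
        queue = deque([source])
        while queue:  # plain BFS, since edges are unweighted here
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        for target, d in dist.items():
            if target != source:
                M[source, target] = 1.0 / d
    return M


# Hypothetical graph: the directed path 0 -> 1 -> 2.
before = inverse_distance_embedding(3, [(0, 1), (1, 2)])

# Deleting the middle node: dropping its row and column is not enough.
naive_after_deletion = np.delete(np.delete(before, 1, axis=0), 1, axis=1)
true_after_deletion = inverse_distance_embedding(2, [])  # survivors relabelled 0 and 1

# Inserting node 3 with the single edge 3 -> 0 keeps all old distances.
after_insertion = inverse_distance_embedding(4, [(0, 1), (1, 2), (3, 0)])
```

Here `naive_after_deletion` still records a path from the first to the last node, while `true_after_deletion` does not, which is the kind of extra change the text describes.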
Definition 1.
Let $\mathbf{M}$ be a matrix embedding of graph $G$ and $f$ be a (complexity) function of $n$ and $m$. We say $\mathbf{M}$ is $f$-update-efficient, iff the following conditions are satisfied:

1. an edge insertion/deletion or an edge weight change in $G$ results in a rank-$K$ update in $\mathbf{M}$, where $K$ is a constant. More precisely, if $\mathbf{M}$ and $\mathbf{M}'$ are correct matrix embeddings before and after one of the above-mentioned update operations in the graph, there exist at most $K$ pairs of vectors $\mathbf{u}_i$ and $\mathbf{v}_i$ (of proper sizes) such that:

   $$\mathbf{M}' = \mathbf{M} + \sum_{i=1}^{K} \mathbf{u}_i \mathbf{v}_i^{\top}$$

   We refer to each pair $\mathbf{u}_i$ and $\mathbf{v}_i$ as a pair of update vectors, and to $\sum_{i=1}^{K} \mathbf{u}_i \mathbf{v}_i^{\top}$ as the update matrix.
2. a node insertion in $G$ results in adding one column and/or one row to $\mathbf{M}$ and also at most a rank-$K$ update in $\mathbf{M}$.
3. deleting the last node (i.e., the node with the largest id) from $G$ results in deleting one column and/or one row from $\mathbf{M}$ and also at most a rank-$K$ update in $\mathbf{M}$.
4. after an update operation in $G$, it is feasible to compute all pairs of update vectors in $O(f(n, m))$ time.
5. if in $G$ any node is permuted with the last node (i.e., with the node that has the largest id), this can be expressed, in $O(f(n, m))$ time, in terms of a rank-$K$ update in $\mathbf{M}$.

Note that the graph and the information used to find the solution of dynamic graph regression, i.e., the pseudoinverse of matrix $\mathbf{M}$ and vector $\mathbf{b}$, change over time. However, we only care about their values before and after an update operation, as we want to find their values after the update operation based on their values before it. In order to keep notations as simple as possible, we do not parameterize them by time; rather, we simply use the terms before and after the update operation to distinguish these two situations.

Sometimes when $f$ is clear from the context, we simply drop it and use the term update-efficient.
Remark 1.
The update-efficient property of matrix embeddings is not closed under matrix addition and matrix subtraction, as they may increase the rank of the matrix.
In Section 5, we use update-efficient matrix embeddings to develop an efficient algorithm for dynamic graph regression. Some well-known matrix embeddings belong to this class of matrix embeddings. In Lemma 1 we show that the (weighted) adjacency matrix is an $n$-update-efficient matrix embedding. Moreover, in Lemma 2 we show that under some conditions, the other widely used matrix embedding, i.e., the Laplacian matrix, provides an $n$-update-efficient matrix embedding, too.
Lemma 1.
Assume that $G$ is a simple, weighted and directed graph and its matrix embedding $\mathbf{M}$ is defined as its weighted adjacency matrix. $\mathbf{M}$ is an $n$-update-efficient matrix embedding (i.e., $f(n, m) = n$).
Proof.
We show that $\mathbf{M}$ satisfies all the five conditions stated in Definition 1.

1. When an edge is inserted/deleted between nodes $i$ and $j$ or its weight changes, only the $ij$-th entry of $\mathbf{M}$ is revised. Let $\delta$ denote the amount of this change in $\mathbf{M}_{ij}$, which can be either positive or negative. To express it in terms of a pair of update vectors $\mathbf{u}$ and $\mathbf{v}$, we only need to define, e.g., $\mathbf{u}$ as a vector (of size $n$) whose elements are all $0$ except the $i$-th element, which is $\delta$; and $\mathbf{v}$ as a vector (of size $n$) whose elements are all $0$, except the $j$-th element which is $1$. Obviously, $\mathbf{u}\mathbf{v}^{\top}$ yields a matrix whose elements, except the $ij$-th element, are all $0$; and its $ij$-th element is $\delta$. Therefore, condition (1) of Definition 1 is satisfied.
2. When a new node is added to $G$, we add a new column for it in $\mathbf{M}$, which contains the weights of its incoming edges (i.e., if there is an edge from a node $i$ to the new node, we put the weight of this edge in the $i$-th entry of the new column). Also, we add a new row for it in $\mathbf{M}$, which contains the weights of its outgoing edges. Furthermore, in $\mathbf{M}$ we only need to update those entries whose column number or row number is the id of the new node. These entries are updated during column/row addition. Hence, no update vector is required. As a result, condition (2) of Definition 1 is satisfied.
3. When the last node is deleted from $G$, we delete its corresponding column and row from $\mathbf{M}$. Furthermore, in $\mathbf{M}$ we only need to update those entries whose column number or row number is the id of the deleted node. These entries are already deleted during column/row deletion. Therefore, no update vector is required and hence, condition (3) of Definition 1 is satisfied.
4. In the case of edge insertion/deletion and edge weight change, the update vectors $\mathbf{u}$ and $\mathbf{v}$ can be computed in $O(n)$ time. In the case of the other update operations, there is no need for update vectors. As a result, condition (4) of Definition 1 is satisfied.
5. When in $G$ we permute a node $i$ with the last node, in $\mathbf{M}$ we only need to first exchange the $i$-th column with the last column and then, in the resulting matrix, exchange the $i$-th row with the last row. These two changes in $\mathbf{M}$ can be expressed in terms of an update matrix of rank at most $4$, as follows. We focus on exchanging the $i$-th column with the last column, as exchanging the $i$-th row with the last row can be done in a similar way. Let $n$ denote the index of the last column. First, note that we may consider exchanging the $i$-th column with the $n$-th column as adding to each entry $\mathbf{M}_{ji}$ in the $i$-th column the value $\mathbf{M}_{jn} - \mathbf{M}_{ji}$; and adding to each entry $\mathbf{M}_{jn}$ in the $n$-th column the value $\mathbf{M}_{ji} - \mathbf{M}_{jn}$. Now let us focus on the $i$-th column (a similar procedure can be used for the $n$-th column). We want to express the additions to the $i$-th column in terms of a pair of update vectors $\mathbf{u}$ and $\mathbf{v}$. We can define $\mathbf{u}$ as a vector whose $j$-th entry contains $\mathbf{M}_{jn} - \mathbf{M}_{ji}$; and $\mathbf{v}$ as a vector whose entries, except the $i$-th entry, are all $0$ and whose $i$-th entry is $1$. Clearly, $\mathbf{u}\mathbf{v}^{\top}$ yields a matrix whose $i$-th column includes the values $\mathbf{M}_{jn} - \mathbf{M}_{ji}$, for $1 \leq j \leq n$, and whose other entries are $0$. As a result, exchanging the $i$-th column with the $n$-th column can be done using two (rank-$1$) pairs of update vectors. Moreover, computing the vectors $\mathbf{u}$ and $\mathbf{v}$ can be done in $O(n)$ time. Hence, condition (5) of Definition 1 is satisfied.

∎
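The two constructions used in this proof, the single pair of update vectors for an edge change and the two pairs that exchange a column with the last column, can be checked numerically. The sketch below assumes NumPy, 0-indexed nodes and hypothetical helper names (`edge_update_vectors`, `column_swap_update_vectors`) that do not come from the paper.

```python
import numpy as np


def edge_update_vectors(n, i, j, delta):
    """Pair (u, v) with u v^T equal to delta at entry (i, j) and 0 elsewhere."""
    u = np.zeros(n)
    u[i] = delta
    v = np.zeros(n)
    v[j] = 1.0
    return u, v


def column_swap_update_vectors(M, i):
    """Two pairs (u, v) whose outer products exchange column i with the last column."""
    last = M.shape[1] - 1
    diff = M[:, last] - M[:, i]           # what must be added to column i
    e_i = np.zeros(M.shape[1])
    e_i[i] = 1.0
    e_last = np.zeros(M.shape[1])
    e_last[last] = 1.0
    return [(diff, e_i), (-diff, e_last)]  # column `last` receives the opposite change


# Hypothetical weighted adjacency matrix of a simple directed graph on 4 nodes.
rng = np.random.default_rng(0)
A = rng.random((4, 4))
np.fill_diagonal(A, 0.0)

# Edge weight change of +0.7 on the edge 1 -> 2, as one rank-1 update.
u, v = edge_update_vectors(4, 1, 2, 0.7)
A_edge = A + np.outer(u, v)

# Exchange column 0 with the last column, as two rank-1 updates.
A_swapped = A + sum(np.outer(u_k, v_k) for u_k, v_k in column_swap_update_vectors(A, 0))
```

Exchanging rows works the same way on the transpose, which is why the full node permutation needs at most four pairs.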
Lemma 2.
Suppose that $G$ is a weighted, undirected and bounded-degree graph and its matrix embedding $\mathbf{M}$ is defined as its weighted Laplacian matrix. $\mathbf{M}$ is an $n$-update-efficient matrix embedding. (Note that when inserting a new node into a bounded-degree graph, at most a constant (bounded) number of edges are drawn between the new node and existing nodes.)
Proof.
Assume that the degrees of the nodes of $G$ are bounded by a constant $\Delta$.

1. When an edge is inserted/deleted between nodes $i$ and $j$ or its weight changes, the entries $\mathbf{M}_{ii}$, $\mathbf{M}_{jj}$, $\mathbf{M}_{ij}$ and $\mathbf{M}_{ji}$ of $\mathbf{M}$ might change. For each of these entries, similar to the first case of the proof of Lemma 1, we can express the change in terms of a pair of update vectors $\mathbf{u}$ and $\mathbf{v}$. Hence, all these changes can be stated in terms of an update matrix whose rank is at most $4$ (which is generated by the sum of four rank-$1$ matrices) and as a result, condition (1) of Definition 1 is satisfied.
2. When a new node is added to $G$, we add a new column for it in $\mathbf{M}$, whose $i$-th entry is $-w_i$, where $w_i$ is the weight of the edge between node $i$ and the new node (if there is no edge between them, $w_i$ is $0$). In a similar way, we add a new row for it in $\mathbf{M}$. Furthermore, in $\mathbf{M}$ we need to increase by $w_i$ each entry $\mathbf{M}_{ii}$ such that $i$ has an edge to the new node. Since the degrees are bounded by $\Delta$, the number of such revisions is at most $\Delta$. Each such revision can be expressed by a pair of update vectors $\mathbf{u}$ and $\mathbf{v}$, where the $i$-th entry of $\mathbf{u}$ is $w_i$, the $i$-th entry of $\mathbf{v}$ is $1$, and the other entries are $0$. These (rank-$1$) update vectors yield an update matrix whose rank is at most $\Delta$; as a result, condition (2) of Definition 1 is satisfied.
3. When the last node of $G$ is deleted, we delete its corresponding column and row from $\mathbf{M}$. Furthermore, in $\mathbf{M}$ we need to decrease by $w_i$ each entry $\mathbf{M}_{ii}$ such that $i$ has an edge to the deleted node. Similar to the case of node addition, this can be done by at most $\Delta$ (rank-$1$) pairs of update vectors, where the non-zero element of $\mathbf{u}$ is $-w_i$, rather than $w_i$. As a result, condition (3) of Definition 1 is satisfied.
4. For all the update operations, the update vectors $\mathbf{u}$ and $\mathbf{v}$ can be computed in $O(n)$ time. As a result, condition (4) of Definition 1 is satisfied.
5. When in $G$ we permute a node $i$ with the last node, in $\mathbf{M}$ we need to update both $\mathbf{A}$ and $\mathbf{D}$. On the one hand, $\mathbf{A}$ can be updated in a way similar to the permutation case of the proof of Lemma 1 and this can be done in $O(n)$ time. On the other hand, by permutation, weighted degrees of the nodes, except node $i$ and the last node, do not change. Therefore $\mathbf{D}$ can be easily updated by exchanging the entry $\mathbf{D}_{ii}$ and the entry in the last row and last column of $\mathbf{D}$, which can be done by two pairs of update vectors. This can be done in $O(n)$ time. Hence, condition (5) of Definition 1 is satisfied.

∎
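The first case of this proof can be illustrated with a small check: an edge insertion changes four entries of the Laplacian, and each change is one pair of update vectors. The sketch below assumes NumPy, 0-indexed nodes and hypothetical helper names; the example graph is an assumption.

```python
import numpy as np


def laplacian(A):
    """Weighted Laplacian D - A of a symmetric weighted adjacency matrix A."""
    return np.diag(A.sum(axis=1)) - A


def unit(n, k):
    e = np.zeros(n)
    e[k] = 1.0
    return e


def laplacian_edge_update_vectors(n, i, j, w):
    """Four pairs (u, v), one per changed entry: (i,i), (j,j), (i,j) and (j,i)."""
    e_i, e_j = unit(n, i), unit(n, j)
    return [(w * e_i, e_i), (w * e_j, e_j), (-w * e_i, e_j), (-w * e_j, e_i)]


# Hypothetical undirected graph on 4 nodes with the edges {0,1} and {1,2}.
A = np.zeros((4, 4))
A[0, 1] = A[1, 0] = 2.0
A[1, 2] = A[2, 1] = 1.0
L_before = laplacian(A)

# Insert the edge {2,3} with weight 0.5.
A_after = A.copy()
A_after[2, 3] = A_after[3, 2] = 0.5
L_after = laplacian(A_after)

update_matrix = sum(np.outer(u, v) for u, v in laplacian_edge_update_vectors(4, 2, 3, 0.5))
```

In this example the four terms add up to $w(\mathbf{e}_i - \mathbf{e}_j)(\mathbf{e}_i - \mathbf{e}_j)^{\top}$, so the update matrix has rank $1$, which is consistent with the "at most $4$" bound used in the proof.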
Note that the weighted Laplacian matrix (as well as the unweighted Laplacian matrix), without the mentioned constraint on degrees, is not update-efficient for any complexity function $f$. The reason is that without the mentioned constraint, node addition/deletion may change the degrees of up to $n$ nodes, which then may require an update matrix of rank up to $n$ (i.e., up to $n$ pairs of update vectors) rather than a constant number. Nevertheless, if we restrict update operations, we might be able to use the Laplacian matrix for the regression of general dynamic graphs. For example, if we limit update operations to edge insertion, edge deletion and edge weight change, it will not be necessary for nodes of the graph to have a bounded degree.
5 Dynamic graph regression under least squares error
In this section, we condition on the existence of update-efficient matrix embeddings and show that the exact optimal solution of graph regression can be updated in a time much faster than computing it from scratch.
At a high level, the algorithm for solving dynamic graph regression consists of three phases:
- the matrix embedding phase, wherein we compute an update-efficient matrix embedding $\mathbf{M}$ for the given graph (we assume that it is static),
- the pre-processing phase, wherein we assume that we are given a static graph and we find a solution for it, and
- the update phase, wherein after any update operation in $G$, $\mathbf{M}^{\dagger}$ and the already found solution are revised to become valid for the new graph.

For the pre-processing phase, we use $\mathbf{M}^{\dagger}\mathbf{b}$ as the optimal solution. Naively computing the SVD and pseudoinverse of $\mathbf{M}$ takes time cubic in terms of $n$ and $m$ (see Section 3). Using fast matrix multiplication [15], which is not practical, this bound can be lowered. In the proof of Theorem 1, we discuss how the optimal solution is updated after an update operation in $G$.
Theorem 1.
Let $\mathbf{M}$ be an $n \times m$, $nm$-update-efficient matrix embedding of graph $G$. After a pre-processing phase, which takes time cubic in terms of $n$ and $m$, after any update operation in the graph, the exact optimal solution of the graph regression problem for the updated graph can be computed in $O(nm)$ time.
Proof.
After an update operation in $G$, we first need to update $\mathbf{M}^{\dagger}$ and then update $\mathbf{M}^{\dagger}\mathbf{b}$. The way of updating $\mathbf{M}^{\dagger}$ depends on the update operation done in the graph.
- Edge insertion/deletion or edge weight change: in any of these cases, due to the update-efficient property of $\mathbf{M}$, we have a sequence of at most $K$ rank-$1$ updates:

  $$\mathbf{M}_i = \mathbf{M}_{i-1} + \mathbf{u}_i \mathbf{v}_i^{\top}$$

  for $1 \leq i \leq K$, where $\mathbf{u}_i$ and $\mathbf{v}_i$ are a pair of update vectors, $\mathbf{M}_0$ is the matrix embedding before the update operation, and $\mathbf{M}_K$ is the correct matrix embedding of $G$ after the update operation. After each rank-$1$ update $i$, we may exploit, e.g., the algorithm of Meyer [6] that, given a matrix $\mathbf{M}$, its Moore-Penrose pseudoinverse $\mathbf{M}^{\dagger}$ and a pair of update vectors $\mathbf{u}$ and $\mathbf{v}$, computes the Moore-Penrose pseudoinverse of $\mathbf{M} + \mathbf{u}\mathbf{v}^{\top}$. Due to the many notations and the long explanation required to introduce this method, we omit its description here and refer the interested reader to [6]. The key point is that given $\mathbf{M}_{i-1}^{\dagger}$, computing $\mathbf{M}_i^{\dagger}$ can be done in $O(nm)$ time. Therefore, after applying this algorithm at most $K$ times, we can compute the Moore-Penrose pseudoinverse of the matrix embedding of the updated graph in $O(nm)$ time.
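- Node insertion: in this case, we need to follow a two-step procedure. In the first step, we need to append a row and (if needed) a column to $\mathbf{M}$ that correspond to the new node and carry its embedding information. Let us focus on appending a new column (appending a new row can be dealt with in a similar way). Speaking precisely, we have the matrix

  $$\mathbf{M}' = \begin{bmatrix} \mathbf{M} & \mathbf{a} \end{bmatrix}$$

  where $\mathbf{a}$ is the column corresponding to the new node and $\mathbf{M}$ is the matrix embedding of $G$ before the update operation. We want to compute $(\mathbf{M}')^{\dagger}$ based on $\mathbf{M}^{\dagger}$ and $\mathbf{a}$. For this, we may use, e.g., Greville's algorithm [18], which is as follows. Let $\mathbf{d} = \mathbf{M}^{\dagger}\mathbf{a}$, $\mathbf{c} = \mathbf{a} - \mathbf{M}\mathbf{d}$, and $\mathbf{0}$ be the null matrix (of proper size), i.e., the matrix whose elements are all zero. Then

  $$(\mathbf{M}')^{\dagger} = \begin{bmatrix} \mathbf{M}^{\dagger} - \mathbf{d}\,\mathbf{g}^{\top} \\ \mathbf{g}^{\top} \end{bmatrix}$$

  where

  $$\mathbf{g}^{\top} = \begin{cases} \mathbf{c}^{\dagger} = \dfrac{\mathbf{c}^{\top}}{\mathbf{c}^{\top}\mathbf{c}} & \text{if } \mathbf{c} \neq \mathbf{0}, \\[2ex] \left(1 + \mathbf{d}^{\top}\mathbf{d}\right)^{-1} \mathbf{d}^{\top}\mathbf{M}^{\dagger} & \text{otherwise.} \end{cases}$$

  Since adding the new node may affect the vector embeddings of the existing nodes by a sequence of at most $K$ rank-$1$ updates, in the second step, we need to reflect these changes in the matrix embedding of $G$. In other words, for at most $K$ pairs of vectors $\mathbf{u}_i$ and $\mathbf{v}_i$, we have:

  $$\mathbf{M}_i = \mathbf{M}_{i-1} + \mathbf{u}_i \mathbf{v}_i^{\top}$$

  where $\mathbf{M}_0$ is $\mathbf{M}'$ and $\mathbf{M}_K$ is the correct matrix embedding of $G$ after the node insertion. Hence, similar to the previous case, we may use the algorithm of Meyer [6] to compute $\mathbf{M}_i^{\dagger}$ based on $\mathbf{M}_{i-1}^{\dagger}$. Each of these two steps takes $O(nm)$ time.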
- Node deletion: in this case, we need to follow a three-step procedure. In the first step, we perform a permutation on $\mathbf{M}$ so that the node that we want to delete becomes the last node in the matrix (i.e., it becomes the node with the largest id). Note that updating $\mathbf{M}$ with this new permutation of nodes may require a sequence of at most $K$ rank-$1$ updates. Hence, we may need to use Meyer's algorithm [6] to compute each $\mathbf{M}_i^{\dagger}$ based on $\mathbf{M}_{i-1}^{\dagger}$ and the update vectors, where $1 \leq i \leq K$. In the second step, we need to delete a row and (if needed) a column from $\mathbf{M}$ that correspond to the deleted node and carry its embedding information. Let us again focus on deleting a column (as deleting a row can be done in a similar way). We have a matrix $\mathbf{M}'$ such that

  $$\mathbf{M} = \begin{bmatrix} \mathbf{M}' & \mathbf{a} \end{bmatrix}$$

  where $\mathbf{a}$ is the column corresponding to the deleted node and $\mathbf{M}$ is the matrix embedding of $G$ before the column is removed. We may again use Greville's algorithm [18] to compute $(\mathbf{M}')^{\dagger}$ based on $\mathbf{M}^{\dagger}$ and $\mathbf{a}$. Finally, since deleting a node may change the vector embeddings of the existing nodes by a sequence of at most $K$ rank-$1$ updates, in the third step, we need to apply these changes to the matrix embedding of the graph. Hence, similar to the previous cases, we can use Meyer's algorithm [6] to compute each $\mathbf{M}_i^{\dagger}$ based on $\mathbf{M}_{i-1}^{\dagger}$ and the update vectors, with $1 \leq i \leq K$. Each of these three steps takes at most $O(nm)$ time.
As a result, after an update operation in $G$, $\mathbf{M}^{\dagger}$ can be updated in $O(nm)$ time. Note that in the case of node deletion, when we perform a permutation on the nodes of $G$ and rows of $\mathbf{M}$, we also need to consistently permute the elements of $\mathbf{b}$ and then remove from $\mathbf{b}$ the measured value of the deleted node (which after the permutation will be the last element of $\mathbf{b}$). These operations can be done in $O(n)$ time. Furthermore, after a node insertion, we need to add the measured value of the new node to the end of $\mathbf{b}$, which can be done in $O(1)$ time. A naive multiplication of the updated $\mathbf{M}^{\dagger}$ and the updated $\mathbf{b}$ yields the optimal solution of the updated graph and takes only $O(nm)$ time. ∎
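The column-appending step of the node insertion case can be sketched with the standard Greville recursion. The code below is a minimal NumPy illustration under stated assumptions: the function names, the relative tolerance used to decide whether the residual is zero, and the random example graph are hypothetical and not part of the paper. Appending a row reuses the same routine on transposes, since $(\mathbf{X}^{\top})^{\dagger} = (\mathbf{X}^{\dagger})^{\top}$. For an adjacency matrix no further rank-$1$ updates are needed after a node insertion (Lemma 1), so Meyer's algorithm [6] is not required in this sketch.

```python
import numpy as np


def greville_append_column(M, M_pinv, a, rel_tol=1e-9):
    """Return the pseudoinverse of [M | a], given M (n x m) and its pseudoinverse (m x n)."""
    d = M_pinv @ a                # coefficients of a in the column space of M
    c = a - M @ d                 # part of a outside the column space of M
    if np.linalg.norm(c) > rel_tol * max(1.0, np.linalg.norm(a)):
        g = c / (c @ c)           # pseudoinverse of the non-zero vector c
    else:
        g = (d @ M_pinv) / (1.0 + d @ d)
    return np.vstack([M_pinv - np.outer(d, g), g])


def insert_node(A, A_pinv, incoming, outgoing):
    """Append a node to an adjacency matrix and update the pseudoinverse."""
    # Step 1: new column holding the weights of edges i -> new node.
    A_col = np.column_stack([A, incoming])
    A_col_pinv = greville_append_column(A, A_pinv, incoming)
    # Step 2: new row holding the weights of edges new node -> i (no self-loop).
    new_row = np.append(outgoing, 0.0)
    A_new = np.vstack([A_col, new_row])
    A_new_pinv = greville_append_column(A_col.T, A_col_pinv.T, new_row).T
    return A_new, A_new_pinv


# Hypothetical weighted directed graph on 5 nodes.
rng = np.random.default_rng(1)
n = 5
A = rng.random((n, n))
np.fill_diagonal(A, 0.0)
A_pinv = np.linalg.pinv(A)        # pre-processing phase (from scratch)

incoming, outgoing = rng.random(n), rng.random(n)
A_new, A_new_pinv = insert_node(A, A_pinv, incoming, outgoing)

b_new = rng.random(n + 1)         # measured values, new node appended last
x_new = A_new_pinv @ b_new        # updated regression solution
```

Each call costs a few matrix-vector products and one outer product, which matches the $O(nm)$ cost per step claimed in the proof. In floating point, the zero test on the residual needs a tolerance, and its value here is only a guess that suits this small example.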
Corollary 1.
Given a graph , if its matrix embedding is defined as its (weighted or unweighted) adjacency matrix, after any update operation in (i.e., node insertion or node deletion or edge insertion or edge deletion or edge weight change), the exact optimal solution of the dynamic graph regression problem can be updated in time.
Proof.
Lemma 1 says that the (weighted) adjacency matrix of is a -update-efficient and, therefore, a -update-efficient matrix embedding of . Hence, by Theorem 1, after any update operation in the graph, the optimal solution of the graph regression problem for the updated graph can be computed in time. ∎
To the best of our knowledge, Corollary 1 provides the first result on updating the exact optimal solution of linear regression over general dynamic graphs in less time than computing it from scratch. Note that if the weighted adjacency matrix of is used as the matrix embedding, computing the optimal solution of dynamic graph regression from scratch will take time.
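To make the rank-one view of an adjacency update concrete, here is a minimal sketch for an edge weight change in an undirected graph, which alters two symmetric entries and can therefore be written as two rank-one updates. It assumes the weighted adjacency matrix is square and nonsingular, which need not hold in general; under that assumption the pseudoinverse coincides with the inverse and the Sherman-Morrison formula can stand in for Meyer's general algorithm [6]. All names (`sherman_morrison`, `change_edge_weight`) and the random weights are illustrative assumptions.

```python
import numpy as np

def sherman_morrison(M_inv, u, v):
    """Inverse of (M + u v^T), given M_inv = inv(M); costs O(n^2)."""
    Mu = M_inv @ u
    vM = v @ M_inv
    denom = 1.0 + v @ Mu
    if abs(denom) < 1e-12:
        # By the matrix determinant lemma, the updated matrix is singular.
        raise ValueError("updated matrix is singular; Sherman-Morrison does not apply")
    return M_inv - np.outer(Mu, vM) / denom

def change_edge_weight(M_inv, i, j, delta):
    """Update inv(M) after adding delta to entries (i, j) and (j, i)."""
    n = M_inv.shape[0]
    e_i, e_j = np.eye(n)[i], np.eye(n)[j]
    M_inv = sherman_morrison(M_inv, delta * e_i, e_j)   # entry (i, j)
    return sherman_morrison(M_inv, delta * e_j, e_i)    # entry (j, i)

# Illustrative weighted complete graph with made-up weights.
rng = np.random.default_rng(1)
n = 6
upper = np.triu(rng.uniform(0.8, 1.2, size=(n, n)), k=1)
M = upper + upper.T                  # symmetric, zero diagonal
b = rng.normal(size=n)               # made-up measured values
M_inv = np.linalg.inv(M)

i, j, delta = 0, 3, -0.1             # decrease the weight of edge (0, 3)
M_inv_new = change_edge_weight(M_inv, i, j, delta)
x_updated = M_inv_new @ b            # solution after the incremental update

M_new = M.copy()
M_new[i, j] += delta
M_new[j, i] += delta
x_scratch = np.linalg.pinv(M_new) @ b   # solution recomputed from scratch
```

For singular or rectangular embeddings the inverse does not exist, and the general pseudoinverse update of Meyer [6] is needed instead.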
As mentioned earlier, if we restrict the update operations to edge insertion, edge deletion and edge weight change, the Laplacian matrix will be a -update-efficient matrix embedding. Hence, we will have the following result.
Corollary 2.
Suppose that we are given an undirected graph , whose matrix embedding is defined as its (weighted or unweighted) Laplacian matrix.
1. After an edge insertion, an edge deletion or an edge weight change in , we can update the exact optimal solution of the dynamic graph regression problem in time.
2. If is a bounded-degree graph, after any update operation in , i.e., node insertion (wherein a bounded number of edges are drawn between the new node and the existing nodes), node deletion, edge insertion, edge deletion or edge weight change, the exact optimal solution can be updated in time.
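The role of edge updates in Corollary 2 can be checked numerically. The following sketch, with a made-up five-node graph, verifies the standard fact that inserting an edge (i, j) of weight w into an undirected graph changes the Laplacian by the rank-one matrix w (e_i - e_j)(e_i - e_j)^T, whereas the same insertion changes the adjacency matrix by a rank-two matrix. The helper `laplacian` and all numeric values are assumptions for illustration only.

```python
import numpy as np

def laplacian(W):
    """Weighted graph Laplacian D - W of a symmetric weight matrix W."""
    return np.diag(W.sum(axis=1)) - W

# A small illustrative graph (edges and weights are made up).
n = 5
W = np.zeros((n, n))
for a, c, w in [(0, 1, 1.0), (1, 2, 2.0), (2, 4, 0.5)]:
    W[a, c] = W[c, a] = w

# Insert a new edge (i, j) with weight w_new.
i, j, w_new = 0, 3, 2.0
W_after = W.copy()
W_after[i, j] = W_after[j, i] = w_new

u = np.zeros(n)
u[i], u[j] = 1.0, -1.0               # u = e_i - e_j

delta_L = laplacian(W_after) - laplacian(W)
rank_laplacian_change = np.linalg.matrix_rank(delta_L)
rank_adjacency_change = np.linalg.matrix_rank(W_after - W)
```

Because the Laplacian is always singular, turning this rank-one change into an updated pseudoinverse requires a general method such as Meyer's algorithm [6], which is not reproduced in this sketch.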
Finally, it might be of interest to see whether similar results can be obtained for other regression models. One may start with linear regression under least absolute deviation, as it is the closest setting to our studied problem, and then go beyond linear models and study nonlinear regression functions for dynamic graphs. In dynamic graph regression under least absolute deviation, the distance is measured using the $\ell_1$ norm. More precisely, we want to solve the following problem:
[TABLE]
where the $\ell_1$ norm of a vector $\mathbf{v}$ is defined as follows: $\|\mathbf{v}\|_1 = \sum_i |v_i|$. What makes this problem and most of the other nonlinear regression problems theoretically more challenging is that they do not have a closed-form solution [39], i.e., a solution like the one presented in Equation 2 for our studied problem. However, we believe it might be possible to derive similar results for some of these models, and leave it as an interesting direction for future work.
6 Experimental results
In this section, we empirically evaluate the update algorithm. First, in Section 6.1, we report the results of our experiments over synthetic and real-world graphs. Then, in Section 6.2, we discuss the applicability and usefulness of dynamic graph regression and its update algorithm in a real-world case study.
6.1 Empirical evaluation
In this section, we examine the empirical efficiency of updating the solution of dynamic graph regression, compared to computing it from scratch. Since node insertion and node deletion include edge insertion and edge deletion, we only consider node insertion and node deletion. In node insertion, a new node, along with some edges connecting it to existing nodes, is added to the graph. In node deletion, the node with the largest id is deleted from the graph. Each of these operations is conducted for times and the average results are reported. We define the matrix embedding of the graph as its adjacency matrix. We run both the update algorithm and the algorithm that computes the solution from scratch, and compare their running times.
6.1.1 Synthetic graphs
We study how the update algorithm scales with respect to the size of the graph, compared to the algorithm that computes the solution from scratch. We generate several ER random graphs [14] with , , , , , , , and nodes, all with the same edge induction probability . Figures 2(a) and 2(b) present the results over the generated graphs for node insertion and node deletion, respectively. Due to the large differences between update and scratch times, the vertical axes of these charts use a logarithmic scale. As depicted in the figures, incrementally updating the solution takes much less time than computing it from scratch. Moreover, it is more scalable: it can handle large graphs within an empirically tractable time. The time of computing the solution from scratch grows quickly with the size of the graph, so that performing it for large graphs is intractable in practice.
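The comparison described above can be mimicked on a small scale. The sketch below is not the paper's experimental code: the graph size `n`, the edge probability `p`, the number of neighbours of the new node and all function names are placeholder assumptions, since the values used in the paper are not reproduced here. It builds an ER adjacency matrix, inserts one node by appending a column and then a row with Greville's recursion, and times this against recomputing the pseudoinverse from scratch.

```python
import time
import numpy as np

def er_adjacency(n, p, rng):
    """Adjacency matrix of an undirected ER random graph without self-loops."""
    upper = np.triu(rng.random((n, n)) < p, k=1)
    return (upper | upper.T).astype(float)

def append_column(A, A_pinv, a, tol=1e-8):
    """Greville step: return [A | a] and its pseudoinverse."""
    d = A_pinv @ a
    c = a - A @ d
    if np.linalg.norm(c) > tol:
        b = c / (c @ c)
    else:
        b = (d @ A_pinv) / (1.0 + d @ d)
    return np.hstack([A, a[:, None]]), np.vstack([A_pinv - np.outer(d, b), b])

def insert_node(A, A_pinv, neighbours):
    """Add a node adjacent to `neighbours`: one new column, then one new row."""
    n = A.shape[0]
    col = np.zeros(n)
    col[neighbours] = 1.0
    A1, P1 = append_column(A, A_pinv, col)           # n x (n+1)
    row = np.append(col, 0.0)                        # new row, no self-loop
    A2_t, P2_t = append_column(A1.T, P1.T, row)      # row step via transposes
    return A2_t.T, P2_t.T

rng = np.random.default_rng(0)
n, p = 100, 0.3                                      # placeholder values
A = er_adjacency(n, p, rng)
A_pinv = np.linalg.pinv(A)
b = rng.normal(size=n + 1)                           # made-up measured values
neighbours = rng.choice(n, size=5, replace=False)

start = time.perf_counter()
A_new, P_new = insert_node(A, A_pinv, neighbours)
x_update = P_new @ b
t_update = time.perf_counter() - start

start = time.perf_counter()
x_scratch = np.linalg.pinv(A_new) @ b
t_scratch = time.perf_counter() - start

print(f"update: {t_update:.4f} s, scratch: {t_scratch:.4f} s")
```

Timings depend on the machine and on the linear algebra backend, and for graphs this small the gap may be modest; the sketch is meant to show the shape of the experiment rather than to reproduce the reported numbers.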
6.1.2 Real-world graphs
We also evaluate the algorithms over the following real-world graphs: wiki-vote [29] (https://snap.stanford.edu/data/wiki-Vote.html), lastfm-asia [36] (https://snap.stanford.edu/data/feather-lastfm-social.html) and soc-sign-bitcoinotc [27] (https://snap.stanford.edu/data/soc-sign-bitcoin-otc.html). Wiki-vote has 7115 nodes and 103689 edges, lastfm-asia has 7624 nodes and 27806 edges, and soc-sign-bitcoinotc has 5881 nodes and 35592 edges. The comparison results are presented in Table 1. They show that updating the solution of linear regression is considerably more efficient than computing it from scratch: across the three datasets and both operations, the update is roughly 4 to 9 times faster.
We note that while wiki-vote and lastfm-asia have almost the same number of nodes (lastfm-asia has slightly more), both update time and scratch time are much larger over wiki-vote. The reason is that wiki-vote is a considerably denser graph than lastfm-asia. When computing the solution from scratch, we need to compute the pseudoinverse of the adjacency matrix of the updated graph. This operation is much more time-consuming for dense graphs than for sparse graphs. On the other hand, for dense graphs more pairs of update vectors are induced after the update operation. Moreover, updating the pseudoinverse of the matrix is more time-consuming. As a result, update time over a graph such as wiki-vote is larger than update time over a sparser graph such as lastfm-asia.
6.2 A case study
Cryptocurrency price prediction is a challenging task in market data analysis. Correlations between cryptocurrencies [4] can be useful in predicting cryptocurrency prices. In this section, we investigate applying a linear regression model to predict cryptocurrency prices. From the coinmetrics website (https://charts.coinmetrics.io), we collect the correlation information and the prices (in US dollars) of the following cryptocurrencies in July 2021: Bitcoin (BTC), Decred (DCR), DigiByte (DGB), Dogecoin (DOGE), Litecoin (LTC), MCO Token (MCO) and Vertcoin (VTC). Figure 3(a) shows the weighted graph of the correlations, where the nodes are the currencies and the weight of an edge represents the correlation between its two endpoints. Figure 3(b) presents the prices of the cryptocurrencies. Our goal is to develop a linear model that predicts the price of a cryptocurrency as a linear combination of its correlations with the other cryptocurrencies. Since the correlations and the prices are dynamic and time-variant, we want to update the model as the data change.
To do so, we consider the weighted adjacency matrix of the correlation graph of Figure 3(a) as the matrix embedding , and the values presented in Figure 3(b) as the vector . We solve the corresponding linear regression problem (under the $\ell_2$ norm). Its exact optimal solution is:
[TABLE]
This means that in the linear regression, the weight of the correlation with BTC is , the weight of the correlation with DCR is , the weight of the correlation with DGB is and so on.
When the correlation graph or vector changes over time, we need to update the weights accordingly. We note that, as stated in Section 4, the weighted adjacency matrix used in this case study is an update-efficient matrix embedding. So, the weights of the linear regression can be updated efficiently. An example change is the decrease of the correlation between BTC and DGB from to . After updating our linear regression model with respect to this change, the weights in vector become:
[TABLE]
7 Conclusion
In this paper, we studied the theory of linear regression over dynamic graphs. First, we presented the class of update-efficient matrix embeddings, which defines conditions sufficient for a matrix embedding to be used for least squares dynamic graph regression. Then, we showed that given an $n \times m$ update-efficient matrix embedding (e.g., the adjacency matrix), after an update operation in the graph, the exact optimal solution of graph regression can be updated in $O(nm)$ time. Finally, by conducting experiments over synthetic and real-world graphs, we showed the high empirical efficiency of the update algorithm.
References
- [1] Kovac, A., Smith, A.D.: Nonparametric regression on a graph. Journal of Computational and Graphical Statistics 20(2), 432–447 (2011). DOI 10.1198/jcgs.2011.09203. Publisher: American Statistical Association
- [2] Balcan, M., Weinberger, K.Q. (eds.): Proceedings of the 33rd International Conference on Machine Learning, ICML 2016, New York City, NY, USA, June 19-24, 2016, JMLR Workshop and Conference Proceedings, vol. 48. JMLR.org (2016). URL http://jmlr.org/proceedings/papers/v48/
- [3] Borgwardt, K.M., Kriegel, H., Wackersreuther, P.: Pattern mining in frequent dynamic subgraphs. In: Proceedings of the 6th IEEE International Conference on Data Mining (ICDM 2006), 18-22 December 2006, Hong Kong, China, pp. 818–822. IEEE Computer Society (2006). DOI 10.1109/ICDM.2006.124. URL https://doi.org/10.1109/ICDM.2006.124
- [4] Calissano, A., Feragen, A., Vantini, S.: Graph-valued regression: Prediction of unlabelled networks in a non-Euclidean graph-space. In: MOX-Report No. 02/2021, Dipartimento di Matematica, Politecnico di Milano, Via Bonardi 9 - 20133 Milano, Italy (2021)
- [5] Calissano, A., Feragen, A., Vantini, S.: Graph-valued regression: Prediction of unlabelled networks in a non-Euclidean graph space. Journal of Multivariate Analysis 190, 104950 (2022). DOI 10.1016/j.jmva.2022.104950. URL https://www.sciencedirect.com/science/article/pii/S0047259X22000021
- [6] Meyer, C.D., Jr.: Generalized inversion of modified matrices. SIAM Journal on Applied Mathematics 24(3), 315–323 (1973)
- [7] Chehreghani, M.: Half a decade of graph convolutional networks. Nature Machine Intelligence 4, 1–2 (2022). DOI 10.1038/s42256-022-00466-8
- [8] Chehreghani, M.H., Abdessalem, T., Bifet, A., Bouzbila, M.: Sampling informative patterns from large single networks. Future Gener. Comput. Syst. 106, 653–658 (2020). DOI 10.1016/j.future.2020.01.042. URL https://doi.org/10.1016/j.future.2020.01.042
