Multi-GCN: Graph Convolutional Networks for Multi-View Networks, with Applications to Global Poverty
Muhammad Raza Khan, Joshua E. Blumenstock

TL;DR
This paper introduces Multi-GCN, a graph convolutional network designed for multi-view networks, demonstrating superior performance in global poverty prediction and other multi-view learning tasks across various datasets.
Contribution
The paper presents a novel Multi-GCN model that effectively captures multi-view relations in graphs, improving semi-supervised learning in poverty research and beyond.
Findings
Outperforms state-of-the-art algorithms on poverty prediction tasks.
Achieves better results on multi-view node labeling in citation networks.
Effective across datasets from multiple developing countries.
| Dataset | Data Type | Nodes | Edges (view 1) | Edges (view 2) | Classes | Features | Label Rate |
|---|---|---|---|---|---|---|---|
| Product Adoption | Phone logs (West Africa) | 17,000 | 23,032 | 18,371 | 2 | 132 | 0.002 |
| Poverty Prediction | Phone logs (East Africa) | 422 | 544 | 1,799 | 2 | 1,709 | 0.094 |
| Gender Prediction | Phone logs (South Asia) | 958 | 992 | 978 | 2 | 821 | 0.042 |
| Citeseer | Citation network | 3,327 | 4,732 | 3,492 | 6 | 3,703 | 0.036 |
| Cora | Citation network | 2,708 | 5,429 | 2,846 | 7 | 1,433 | 0.052 |
| Method | Product Adoption | Poverty Prediction | Gender Prediction |
|---|---|---|---|
| DeepWalk (first view) | 56.43 ± 0.187 | 51.91 ± 0.62 | 53.18 ± 0.55 |
| DeepWalk (second view) | 51.97 ± 0.112 | 50.34 ± 0.36 | 50.84 ± 0.64 |
| DeepWalk (view union) | 56.81 ± 0.114 | 50.87 ± 0.95 | 52.34 ± 0.50 |
| Node2vec (first view) | 53.87 ± 0.20 | 52.26 ± 0.58 | 50.12 ± 0.40 |
| Node2vec (second view) | 50.50 ± 0.11 | 49.70 ± 0.23 | 51.68 ± 0.40 |
| Node2vec (view union) | 54.50 ± 0.11 | 50.52 ± 0.63 | 51.64 ± 0.53 |
| LINE (first view) | 51.11 ± 0.01 | 50.15 ± 0.02 | 51.56 ± 0.001 |
| LINE (second view) | 50.83 ± 0.01 | 52.29 ± 0.001 | 50.00 ± 0.001 |
| LINE (view union) | 56.26 ± 0.003 | 50.18 ± 0.001 | 51.33 ± 0.002 |
| GCN (first view) | 70.74 ± 2.2 | 55.19 ± 2.33 | 63.97 ± 1.29 |
| GCN (second view) | 71.40 ± 1.81 | 50.06 ± 0.81 | 63.01 ± 0.013 |
| GCN (view union) | 71.90 ± 0.9 | 50.22 ± 0.56 | 63.90 ± 1.32 |
| Multi-GCN (this paper) | 73.47 ± 0.91 | 59.23 ± 0.20 | 66.34 ± 1.03 |
| Method | Citeseer | Cora |
|---|---|---|
| Predefined train-test splits | | |
| ManiReg (first view) - ? (?) | 60.1 | 59.5 |
| DeepWalk (first view) - ? (?) | 43.2 | 67.2 |
| Planetoid (first view) - ? (?) | 64.7 | 75.7 |
| GCN (first view) | 70.3 | 81.5 |
| GCN (second view) | 50.7 | 53.6 |
| GCN (view union) | 70.7 | 80.4 |
| Multi-GCN (this paper) | 71.3 | 82.5 |
| Randomized train-test splits | | |
| GCN (first view) | 67.9 ± 0.5 | 80.1 ± 0.5 |
| GCN (second view) | 53.6 ± 0.1 | 56.9 ± 0.3 |
| GCN (view union) | 67.9 ± 0.3 | 78.5 ± 0.1 |
| Multi-GCN (this paper) | 70.5 ± 0.2 | 81.1 ± 0.2 |
Multi-GCN: Graph Convolutional Networks for Multi-View Networks,
with Applications to Global Poverty
Muhammad Raza Khan, Joshua E. Blumenstock
University of California, Berkeley
Abstract
With the rapid expansion of mobile phone networks in developing countries, large-scale graph machine learning has gained sudden relevance in the study of global poverty. Recent applications range from humanitarian response and poverty estimation to urban planning and epidemic containment. Yet the vast majority of computational tools and algorithms used in these applications do not account for the multi-view nature of social networks: people are related in myriad ways, but most graph learning models treat relations as binary. In this paper, we develop a graph-based convolutional network for learning on multi-view networks. We show that this method outperforms state-of-the-art semi-supervised learning algorithms on three different prediction tasks using mobile phone datasets from three different developing countries. We also show that, while designed specifically for use in poverty research, the algorithm also outperforms existing benchmarks on a broader set of learning tasks on multi-view networks, including node labelling in citation networks.
1 Introduction
Over the past several years, large-scale graph machine learning has gained increasing relevance in the domain of international poverty research (?). Driven largely by the expansion of mobile phone networks throughout developing countries – roughly 95% of the world population now has mobile phone coverage (?) – vast quantities of network data are constantly being generated by people living in even extremely poor and marginalized communities. Recent work has shown how such data can be used to inform critical policy decisions, including the measurement of living conditions (?), the spread of infectious diseases (?), and the management of humanitarian crises (?). Private companies are also taking advantage of this new source of data, for instance by using data from mobile phones to generate credit scores that can expand credit to millions of people historically shut out of the formal banking ecosystem (?).
However, a critical constraint to the use of these data in settings related to economic development is the lack of scalable algorithms for performing prediction tasks on sparse multi-view networks. Multi-view networks (also referred to as multiplex and multi-modal networks) are networks in which nodes can be related in multiple ways. They are the natural abstraction for mobile phone networks, where different individuals have different types of relationships and can interact using different modalities (such as phone calls, text messages, money transfers, and app-based activity). Yet the vast majority of applied research using mobile phone data, in developing and developed countries alike, ignores the multi-view nature of phone networks.
This paper develops a novel approach for learning on multi-view networks, which bridges two different strands in the research literature. The first strand involves methods for efficient analysis of multi-view networks; the second explores algorithms for semi-supervised graph learning (see Related Work, below). The method we develop provides an efficient approach for applying convolutional neural networks to multi-view graph-structured data. We benchmark this new method, which we call Multi-GCN (short for Multi-View Graph Convolutional Networks), on three different mobile network datasets and three different prediction tasks relevant to the international development community: (1) predicting the adoption of a new “financial inclusion” technology in a West African country; (2) predicting whether an individual is living below the poverty line in an East African country; (3) predicting the gender of mobile phone subscribers in a South Asian country. In all cases, we find that Multi-GCN outperforms state-of-the-art benchmarks, including standard Graph Convolutional Networks (?), Node2vec (?), DeepWalk (?), and LINE (?).
While designed specifically with the developing-country context in mind (where the sparsity and multi-view properties of networks are very salient), we show that Multi-GCN can be more generally applied to a wide range of problems involving multi-view networks. Indeed, most real-world networks are multi-view, including the network data most frequently used by AI researchers (e.g., data from Twitter, Amazon, Netflix, etc.). Our second set of results shows that Multi-GCN can improve upon state-of-the-art algorithms not just in poverty-related contexts, but also in traditional classification problems. In particular, we show that Multi-GCN outperforms competing algorithms on citation labeling tasks (using benchmark datasets from Citeseer and Cora) that have been studied extensively in prior work.
2 Related Work
2.1 Technical Related Work
Our goal is to develop an efficient method for node-level transductive semi-supervised learning over multi-view graphs. Here, we begin with a general overview of semi-supervised learning, then focus on various approaches to graph-based semi-supervised learning, and finally discuss related work on multi-view networks.
Graph-Based Semi-Supervised Learning
One of the biggest issues with applying supervised learning algorithms in a developing country is that it is often costly to collect labels for training. For instance, when using mobile phone data to predict the wealth of subscribers, ? (?) manually conducted a survey of roughly 1,000 subscribers. Semi-supervised learning tries to solve this problem by using unlabeled data along with the labeled data to train better classifiers (see (?) for a survey). Our focus is on transductive semi-supervised learning, which assumes that all the unlabeled data is available at training time and does not attempt to generalize to data unseen during training.
Graph-based semi-supervised learning (GSSL) is a popular approach for semi-supervised learning that treats labeled and unlabeled instances as graph vertices, and relationships between instances as edges (?). GSSL algorithms try to learn a classifier that is consistent with the labeled data while making sure that the predictions for similar nodes are also similar. This is achieved by minimizing a loss function with two terms: a) a supervised loss over the labeled instances, and b) a graph-based regularization term. Different GSSL algorithms use different functions for graph regularization. Label propagation-based approaches, for instance, use a constrained label lookup function (e.g., ? (?)). Relatedly, kernel-based approaches parameterize the regularization term in the Reproducing Kernel Hilbert Space (RKHS).
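To make the two-term loss concrete, the following is a minimal sketch, assuming a squared-error supervised loss and the common graph Laplacian penalty f^T (D - A) f as the regularizer. Both choices, and the names `laplacian_regularizer`, `gssl_loss`, and `lam`, are illustrative assumptions rather than the formulation of any specific GSSL algorithm cited above.

```python
import numpy as np

def laplacian_regularizer(A, f):
    """Return f^T (D - A) f, which sums (f_i - f_j)^2 over the edges of A."""
    D = np.diag(A.sum(axis=1))
    return float(f @ (D - A) @ f)

def gssl_loss(A, f, y, labeled, lam=1.0):
    """Supervised squared error on labeled nodes plus lam times the graph penalty."""
    supervised = float(np.sum((f[labeled] - y[labeled]) ** 2))
    return supervised + lam * laplacian_regularizer(A, f)

# Toy path graph 0 - 1 - 2 in which only nodes 0 and 2 are labeled.
A = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
y = np.array([1.0, 0.0, -1.0])        # the entry for node 1 is never used
labeled = np.array([0, 2])

smooth = np.array([1.0, 0.0, -1.0])   # changes gradually along the path
rough = np.array([1.0, -1.0, -1.0])   # jumps across the edge (0, 1)

print(gssl_loss(A, smooth, y, labeled))
print(gssl_loss(A, rough, y, labeled))
```

Both candidate predictions fit the two labeled nodes exactly, so the difference in loss comes entirely from the graph term, which penalizes the prediction that changes abruptly across an edge.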
Learning Over Graphs
The success of word embedding algorithms like Word2Vec (?) has inspired similar algorithms for graphs. For instance, DeepWalk (?) learns embeddings by predicting the neighborhood of nodes based on random walks over the graphs, while LINE (?) and Node2vec (?) allow for advanced sampling schemes. More recently, neural network-based approaches have been proposed to perform learning over graphs. These have been extended to the task of semi-supervised learning (?; ?), including recent work by ? (?) that proposes a Graph Convolutional Network (GCN), which we take as a starting point for our approach.
Learning Over Multi-View Graphs
The key distinction between our approach and prior work is our desire to handle graphs with multiple views, i.e., graphs where vertices can be connected in more than one way. In recent years, many different algorithms have been proposed for learning on multi-view graphs. These algorithms can be broadly divided into three main categories: 1) co-training algorithms, 2) learning with multiple kernels, and 3) subspace learning (see ? (?) for a survey). Recent work by ? (?) shows that subspace approaches, which find a latent subspace shared by multiple views, perform well relative to co-training and kernelized approaches on a range of tasks. We therefore focus our attention on integrating subspace learning approaches with recent innovations in graph convolutional networks.
Comparison with existing work
Our main contribution is to propose an efficient method for adapting GSSL to multi-view contexts. Existing approaches to GSSL cannot be readily implemented on such data; those algorithms that do handle multiple views generally treat views and vertices equally. We show that current “state of the art” methods like Graph Convolutional Networks (?) can be enhanced by augmenting the input graph using subspace analysis over Grassmann manifolds. ? (?) have demonstrated that a subspace merging approach can be quite accurate for the problem of cross-domain recommendation, which differs from our experimental settings and context as described in Section 4.
2.2 Empirical Related Work
Our experimental results focus on three prediction tasks of relevance to the international development community:
Predicting poverty.
A large number of humanitarian applications, from poverty targeting to program monitoring, require accurate estimates of the welfare of beneficiary populations. Recently, several papers have shown how digital trace data can be used to estimate the socioeconomic status of individuals, households, and villages. For instance, ? (?) show that daytime satellite imagery can be used to estimate village wealth; ? (?) find that Twitter data can be used to estimate levels of deprivation; and ? (2015) shows that mobile phone metadata can be used to estimate the welfare of individuals and regions.
Product adoption.
We focus on the adoption of “mobile money”, a suite of phone-based financial services that are designed to promote financial inclusion among those traditionally shut out of the formal banking ecosystem (?). Within this literature, our work relates most closely to ? (2016), who analyze the predictors of mobile money adoption in three different developing countries.
Gender prediction.
Gender equality and women’s empowerment constitute one of the Sustainable Development Goals, and recent work explores how digital trace data can be used to assess progress toward this goal (?). ? (?) and ? (?) show that gender can be predicted from social media and mobile phone data.
Broadly, these prior studies demonstrate a proof of concept: that digital trace data can be used to predict the characteristics and outcomes of individuals. However, such analyses rely on off-the-shelf algorithms that rarely, if ever, account for the multi-view nature of real-world social networks. This paper shows that a simple approach to multi-view learning can yield substantial improvements on these real-world prediction tasks.
3 Multi-GCN: Multi-View Graph Convolutional Networks
Our approach to semi-supervised learning on multi-view graphs integrates three steps, depicted in Figure 1. First, we use methods from subspace analysis to efficiently merge multiple views of the same graph. Second, we use a manifold ranking procedure to identify the most informative sub-components of the graph and to prune the graph upon which learning is performed. Finally, we apply a convolutional neural network, adapted to graph-structured data, to allow for semi-supervised node classification.
3.1 Merging Subspace Representations
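Given an undirected multilayer graph $G$ with $M$ layers $\{G_i\}_{i=1}^{M}$ such that each layer $G_i = (V, E_i)$ has the same vertex set $V$ (with $|V| = N$) but the same or a different edge set $E_i$, we first calculate the graph Laplacian for each of the individual layers. If $D_i$ and $A_i$ represent the degree matrix and the adjacency matrix for the $i$-th view of the graph, then the normalized graph Laplacian is defined as

$$L_i = D_i^{-1/2} (D_i - A_i) D_i^{-1/2} \tag{1}$$

Given the graph Laplacian $L_i$ for each layer of the graph, we calculate the spectral embedding matrix $U_i$ through trace minimization:

$$\min_{U_i \in \mathbb{R}^{N \times k}} \operatorname{tr}\left(U_i^{\top} L_i U_i\right) \quad \text{s.t.} \quad U_i^{\top} U_i = I \tag{2}$$

This trace minimization problem can be solved by the Rayleigh-Ritz theorem. The solution $U_i$ contains the first $k$ eigenvectors corresponding to the $k$ smallest eigenvalues of $L_i$. The spectral embedding embeds nodes of the original graph to a low dimensional spectral domain (see ? (?) for details).

A Grassmann manifold can be considered as a set of $k$-dimensional linear subspaces in $\mathbb{R}^{N}$ where each unique subspace is mapped to a unique point on the manifold. Each point on the manifold can be represented by an orthonormal matrix $Y \in \mathbb{R}^{N \times k}$ whose columns span the corresponding $k$-dimensional subspace in $\mathbb{R}^{N}$, and the distance between the subspaces can be calculated from the set of principal angles $\{\theta_j\}_{j=1}^{k}$ between these subspaces. ? (?) show that the projection distance between two subspaces $Y_1$ and $Y_2$ can be represented as a separate trace minimization problem:

$$d_{\mathrm{proj}}^{2}(Y_1, Y_2) = \sum_{j=1}^{k} \sin^{2}\theta_j = k - \operatorname{tr}\left(Y_1 Y_1^{\top} Y_2 Y_2^{\top}\right) \tag{3}$$

where, based on Eq. 3, the projection distance between the target representative subspace $U$ and the individual subspaces $\{U_i\}_{i=1}^{M}$ can be calculated as:

$$d_{\mathrm{proj}}^{2}\left(U, \{U_i\}_{i=1}^{M}\right) = \sum_{i=1}^{M} d_{\mathrm{proj}}^{2}(U, U_i) = kM - \sum_{i=1}^{M} \operatorname{tr}\left(U U^{\top} U_i U_i^{\top}\right) \tag{4}$$

Minimization of Eq. 4 ensures that individual subspaces are close to the final representative subspace $U$.

Finally, to ensure that the original vertex connectivity in each graph layer is preserved, we include a separate term that minimizes the quadratic-form Laplacian (evaluated on the columns of $U$):

$$\min_{U \in \mathbb{R}^{N \times k}} \sum_{i=1}^{M} \operatorname{tr}\left(U^{\top} L_i U\right) + \alpha \left[ kM - \sum_{i=1}^{M} \operatorname{tr}\left(U U^{\top} U_i U_i^{\top}\right) \right] \quad \text{s.t.} \quad U^{\top} U = I \tag{5}$$

In Eq. 5, $\alpha$ is the regularization parameter that balances the trade-off between the two terms in the objective function. Rearranging Eq. 5 and ignoring the constant terms yields

$$\min_{U \in \mathbb{R}^{N \times k}} \operatorname{tr}\left[ U^{\top} \left( \sum_{i=1}^{M} L_i - \alpha \sum_{i=1}^{M} U_i U_i^{\top} \right) U \right] \quad \text{s.t.} \quad U^{\top} U = I \tag{6}$$

As before, the Rayleigh-Ritz theorem can be used to solve Eq. 5. The solution is given by the first $k$ eigenvectors of the modified Laplacian:

$$L_{\mathrm{mod}} = \sum_{i=1}^{M} L_i - \alpha \sum_{i=1}^{M} U_i U_i^{\top} \tag{7}$$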
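The merging procedure of this subsection can be sketched in a few lines of NumPy. This is a minimal sketch under stated assumptions: dense symmetric adjacency matrices, the modified Laplacian taken as the sum of per-view normalized Laplacians minus `alpha` times the sum of the projection matrices `U_i U_i^T` (the subspace-merging form of the Grassmann manifold approach the section cites), and hypothetical helper names (`normalized_laplacian`, `bottom_k_eigenvectors`, `merge_views`). It is not the authors' implementation.

```python
import numpy as np

def normalized_laplacian(A):
    """Return D^{-1/2} (D - A) D^{-1/2}; isolated nodes get an all-zero row."""
    d = A.sum(axis=1)
    d_inv_sqrt = np.zeros_like(d, dtype=float)
    nonzero = d > 0
    d_inv_sqrt[nonzero] = d[nonzero] ** -0.5
    return d_inv_sqrt[:, None] * (np.diag(d) - A) * d_inv_sqrt[None, :]

def bottom_k_eigenvectors(L, k):
    """Eigenvectors for the k smallest eigenvalues of a symmetric matrix."""
    _, vectors = np.linalg.eigh(L)    # eigh sorts eigenvalues in ascending order
    return vectors[:, :k]

def merge_views(adjacencies, k, alpha):
    """Return the modified Laplacian and the merged spectral embedding U."""
    laplacians = [normalized_laplacian(A) for A in adjacencies]
    subspaces = [bottom_k_eigenvectors(L, k) for L in laplacians]
    L_mod = sum(laplacians) - alpha * sum(U_i @ U_i.T for U_i in subspaces)
    return L_mod, bottom_k_eigenvectors(L_mod, k)

# Two toy views over the same four nodes (stand-ins for call and text networks).
A_calls = np.array([[0, 1, 0, 0],
                    [1, 0, 1, 0],
                    [0, 1, 0, 1],
                    [0, 0, 1, 0]], dtype=float)
A_texts = np.array([[0, 1, 1, 0],
                    [1, 0, 0, 0],
                    [1, 0, 0, 1],
                    [0, 0, 1, 0]], dtype=float)

L_mod, U = merge_views([A_calls, A_texts], k=2, alpha=0.5)
print(U.shape)
```

Using `np.linalg.eigh` rather than a general eigen-solver exploits the symmetry of the Laplacians and returns orthonormal eigenvectors, which is what the orthogonality constraint in the trace minimization requires. For large sparse graphs, a sparse eigen-solver would be the natural substitute.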
3.2 Graph-Based Manifold Ranking
Though the modified Laplacian calculated above can be fed directly to the downstream graph convolutional networks, model performance can be increased by ranking the nodes in the manifold based on their saliency with respect to some critical nodes (?). To rank points on the manifold, we use the closed form function,
[TABLE]
Here, represents the identity matrix, is the normalized Laplacian as calculated in Eq. 7, and is the regularization parameter. Given a vector containing the indices of the query nodes, Eq. 8 calculates the saliency of the other nodes with respect to the query nodes; the saliency of these nodes can then be used to add or prune edges from the induced underlying graph. The use of manifold-based ranking suits our approach as the modified Laplacian representing merged subspaces can be used directly for saliency detection. The query nodes can be selected as the centroids determined by any clustering algorithm over the manifold.
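As an illustration of closed-form ranking with respect to query nodes, the sketch below uses the classic manifold ranking formula f = (I - alpha S)^{-1} q, where S = D^{-1/2} W D^{-1/2} is the symmetrically normalized affinity matrix and q is an indicator vector of the query nodes. This classic form is an assumption swapped in for illustration; the method described above operates on its merged Laplacian instead, and the function name `manifold_ranking` is hypothetical.

```python
import numpy as np

def manifold_ranking(W, query_idx, alpha=0.99):
    """Score every node by its closeness to the query nodes on the graph W."""
    n = W.shape[0]
    d = W.sum(axis=1)
    # Guard against isolated nodes (degree zero) when normalizing.
    d_inv_sqrt = np.where(d > 0, 1.0 / np.sqrt(np.maximum(d, 1e-12)), 0.0)
    S = d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    q = np.zeros(n)
    q[query_idx] = 1.0
    # Solve (I - alpha * S) f = q instead of forming the inverse explicitly.
    return np.linalg.solve(np.eye(n) - alpha * S, q)

# Two disconnected triangles: nodes 0-2 and nodes 3-5. Node 0 is the query.
W = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)]:
    W[i, j] = W[j, i] = 1.0

scores = manifold_ranking(W, query_idx=[0])
print(np.round(scores, 3))
```

Nodes in the same component as the query receive positive scores, while nodes with no path to the query receive a score of zero. Thresholding such scores is one way edges between salient nodes could be kept and others pruned.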
The algorithm for the subspace merging and subsequent manifold ranking is shown in Algorithm 1. The time complexity of Algorithm 1 for a graph with layers with users per layer is where represents the number of eigenvectors to be calculated and is the number of centroids is the cost of computing Laplacians and Eigenvector matrix for all the layers ; is the cost of computing modified Laplacian; is the cost of computing clusters using k-means clustering; is the cost of manifold ranking. using the iterative version described by (?).
3.3 Graph Convolution Networks
The application of convolutional neural networks to irregular or non-Euclidean grids, such as graphs, is based on the fact that convolutions are multiplications in the Fourier domain, which implies that graph convolutions can be expressed as the multiplication of a signal $x$ with a filter $g_{\theta}$ (see ? (?)):

$$g_{\theta} \star x = U g_{\theta} U^{\top} x \tag{9}$$

Here, $U$ represents the eigen-decomposition of the normalized graph Laplacian $L = I - D^{-1/2} A D^{-1/2} = U \Lambda U^{\top}$, and $I$, $D$, $A$ represent the identity, degree and the adjacency matrix, respectively. Graph convolutions can be further expressed in terms of Chebyshev polynomials as

$$g_{\theta'} \star x \approx \sum_{k=0}^{K} \theta'_k T_k(\tilde{L}) x \tag{10}$$

where $\tilde{L} = \frac{2}{\lambda_{\max}} L - I$ is the rescaled Laplacian, $T_k$ represents the Chebyshev polynomials, and $\theta'$ represents the vector of Chebyshev coefficients. Following ? (?), by approximating the maximum value of the largest eigenvalue ($\lambda_{\max} \approx 2$) and constraining the number of free parameters, the convolution operation can be represented as

$$g_{\theta} \star x \approx \theta \, \tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2} x \tag{11}$$

where $\tilde{A} = A + I$ and $\tilde{D}_{ii} = \sum_j \tilde{A}_{ij}$ are the renormalized versions of $A$ and $D$. This renormalization avoids numerical instabilities resulting from exploding/vanishing gradients (?).
The modified graph ( in Algorithm 1) resulting from the merger of Laplacians using the subspace analysis and manifold ranking can be fed directly into the graph convolution networks defined above. The forward propagation model for a two layer network can then be represented as
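$$Z = \operatorname{softmax}\left( \hat{A} \, \operatorname{ReLU}\left( \hat{A} X W^{(0)} \right) W^{(1)} \right) \tag{12}$$

Here, $\hat{A} = \tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2}$ is calculated as a preprocessing step before giving the input to the neural network, and $X$ is the matrix of node features. $W^{(0)}$ and $W^{(1)}$ represent the input-to-hidden-layer and hidden-layer-to-output weight matrices for a two layer neural network, and can be trained using gradient descent. ReLU and Softmax represent the activation functions in the hidden and output layers.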
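The two-layer forward pass can be sketched in NumPy as follows. This is a minimal sketch, assuming the standard renormalized adjacency (add self-loops, then normalize symmetrically by degree) and randomly initialized weights. The function names, layer sizes, and toy graph are illustrative assumptions, and no training loop is shown.

```python
import numpy as np

def renormalized_adjacency(A):
    """Add self-loops and apply symmetric degree normalization."""
    A_tilde = A + np.eye(A.shape[0])
    d_inv_sqrt = A_tilde.sum(axis=1) ** -0.5   # degrees are >= 1 after self-loops
    return d_inv_sqrt[:, None] * A_tilde * d_inv_sqrt[None, :]

def softmax(Z):
    """Row-wise softmax with the usual max-subtraction for stability."""
    shifted = Z - Z.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def gcn_forward(A_hat, X, W0, W1):
    """Two-layer GCN: softmax(A_hat @ ReLU(A_hat @ X @ W0) @ W1)."""
    hidden = np.maximum(A_hat @ X @ W0, 0.0)   # ReLU activation
    return softmax(A_hat @ hidden @ W1)

rng = np.random.default_rng(0)
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 1],
              [0, 1, 0, 1],
              [0, 1, 1, 0]], dtype=float)      # toy graph with 4 nodes
X = rng.normal(size=(4, 3))                    # 3 features per node
W0 = rng.normal(size=(3, 5))                   # input -> hidden (5 units)
W1 = rng.normal(size=(5, 2))                   # hidden -> 2 classes

A_hat = renormalized_adjacency(A)              # computed once, before the forward pass
Z = gcn_forward(A_hat, X, W0, W1)
print(Z.shape)
```

Each row of `Z` is a probability distribution over classes for one node. In the multi-view setting, `A` would be the adjacency matrix of the merged and pruned graph rather than that of a single view.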
4 Experiments and Data
4.1 Datasets
Our first set of experiments tests Multi-GCN on three prediction tasks relevant to international development. Each one uses a different dataset of mobile phone Call Detail Records (CDR), obtained from three different developing countries with GDP per capita less than $1,600 USD. These datasets contain detailed metadata on all communication events (calls, messages) that occur on the mobile phone network. Each CDR dataset contains multiple possible relationships between nodes (views); we extract one view corresponding to phone calls between users, and another corresponding to text messages. We separately construct a large set of features for each user (such as total call volume and degree centrality), using the combinatoric approach described in ? (?).
Table 1 presents summary statistics for each of these datasets. The connections and sparsity of each network are shown in Figure 2. These spy plots help visualize the structure of the adjacency matrices for each graph view, where a dot indicates that an edge exists between those two individuals on the corresponding view.
Product adoption dataset
The first dataset that we use is a sample of a dataset of mobile phone activity from a West African country. Here, the classification of interest is whether or not the user eventually adopts a new financial inclusion product. There are two possible classifications: (1) Did not adopt; (2) Adopted and used the product. Following the experimental setup described in ? (?), we randomly selected 20 users from each category (40 total) for the training dataset; the validation and the testing dataset consist of 500 and 1000 randomly selected users, respectively.
Poverty prediction dataset
The wealth prediction dataset consists of several thousand transactions of different mobile phone users from an East African country. We attempt to classify users as poor or non-poor, where labels were obtained by ? (?) through a small set of phone surveys that were conducted with mobile phone subscribers. Again, we randomly selected 20 users from each category as the training dataset, while the validation and testing datasets contain 100 and 200 users, respectively.
Gender prediction dataset
The gender prediction dataset originates from a developing country in South Asia. Here, the classification task is to predict the gender of the mobile phone users, where gender labels are provided by the operator for a small number of labeled instances. We randomly select 20 users from each category for training; the validation and testing datasets contain 100 and 800 users, respectively.
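The sampling protocol shared by the three phone datasets (a fixed number of training nodes per class, then disjoint validation and test sets) can be sketched as below. The function name `sample_split` and the synthetic alternating labels are assumptions for illustration; the sizes are those stated for the gender prediction dataset.

```python
import random

def sample_split(labels, per_class, n_val, n_test, seed=0):
    """Draw per_class training nodes from each class, then disjoint val/test sets."""
    rng = random.Random(seed)
    by_class = {}
    for node, label in enumerate(labels):
        by_class.setdefault(label, []).append(node)

    train = []
    for label in sorted(by_class):
        train.extend(rng.sample(by_class[label], per_class))

    # Validation and test nodes come from whatever was not used for training.
    train_set = set(train)
    remaining = [node for node in range(len(labels)) if node not in train_set]
    rng.shuffle(remaining)
    val = remaining[:n_val]
    test = remaining[n_val:n_val + n_test]
    return train, val, test

# Synthetic binary labels for 958 nodes (the node count of the gender dataset).
labels = [node % 2 for node in range(958)]
train, val, test = sample_split(labels, per_class=20, n_val=100, n_test=800)

label_rate = len(train) / len(labels)
print(len(train), len(val), len(test), round(label_rate, 3))
```

With 40 training nodes out of 958, the label rate rounds to 0.042, which agrees with the value listed for this dataset in Table 1.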
Citation classification datasets
A final set of experiments replicates the experimental design of ? (?) to test Multi-GCN on more standard node labelling tasks. In these datasets, nodes are documents and the first view corresponds to the citation links between the research papers. We construct the second view from the textual similarity of the papers. Specifically, if the normalized cosine similarity between documents is greater than 0.8, then we create an edge in the second view of the citation network.
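The construction of the second view can be sketched as follows: compute pairwise cosine similarity between document feature vectors and connect pairs whose similarity exceeds the threshold. The feature representation (here a small dense matrix) and the function name `similarity_view` are assumptions; only the 0.8 threshold comes from the text.

```python
import numpy as np

def similarity_view(X, threshold=0.8):
    """Adjacency matrix linking documents whose cosine similarity exceeds threshold."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    norms[norms == 0] = 1.0                  # leave all-zero documents unconnected
    unit = X / norms
    similarity = unit @ unit.T
    A = (similarity > threshold).astype(float)
    np.fill_diagonal(A, 0.0)                 # no self-loops
    return A

# Three toy documents: the first two share vocabulary, the third does not.
X = np.array([[1.0, 1.0, 0.0],
              [1.0, 1.0, 0.1],
              [0.0, 0.0, 1.0]])
A_text = similarity_view(X)
print(A_text)
```

Because cosine similarity is symmetric, the resulting adjacency matrix is symmetric as well, so it can be treated as an undirected view alongside the citation links.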
4.2 Experimental setup
In general, our goal is to correctly classify nodes in a network, where only a very small fraction of nodes are labeled. In the experiments, we start from a small sample of labeled nodes and test the ability of Multi-GCN, as well as several state-of-the-art algorithms, to correctly classify unlabeled nodes in the validation and testing sets. We use three popular node embedding algorithms (Node2vec, DeepWalk, and LINE) as a first set of baselines. In addition, we provide three baselines based on graph convolutional networks (?). The first two, GCN (first view) and GCN (second view), apply GCN over the two respective adjacency matrices from phone and text message activity. The third, GCN (view union), operates on the union of the adjacency matrices of the first view and the second view. In each GCN baseline, the node features are constructed from the adjacency matrix of the first view.
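The view-union baseline can be illustrated with a short sketch, assuming that "union" means an edge is present if it exists in either view and that edge weights are binarized. That reading, and the function name `view_union`, are assumptions.

```python
import numpy as np

def view_union(A1, A2):
    """Binary adjacency matrix containing every edge present in either view."""
    return np.logical_or(A1 > 0, A2 > 0).astype(float)

A_first = np.array([[0, 1, 0],
                    [1, 0, 0],
                    [0, 0, 0]], dtype=float)
A_second = np.array([[0, 0, 0],
                     [0, 0, 1],
                     [0, 1, 0]], dtype=float)

A_union = view_union(A_first, A_second)
print(int(A_union.sum() / 2))   # number of undirected edges in the union
```

A union of this kind discards the information about which view an edge came from, which is the limitation that a multi-view method is meant to address.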
After merging different views, we rank the interaction between nodes using Eq. 8 based on their salience with respect to the query points. The value of the regularization parameter (see Eq. 7) is selected through 10-fold cross-validation. We similarly tune the hyper-parameters to 0.99 and set the number of query points to ten times the number of classes.
After adding salient edges and eliminating non-salient edges through the ranking process, both the adjacency matrix of the modified graph and the node features are passed as input to a two-layer graph convolutional network as described in Section 3. All of the GCN-based models, including Multi-GCN, are trained for a maximum of 200 iterations, using Adam (Adaptive moment estimation extension to stochastic gradient descent – see ? (?)) and a learning rate of 0.01. Other GCN hyper-parameters are set using the same values reported in ? (?).
5 Results
Experimental results for the three developing-country datasets are shown in Table 2. Each row in this table indicates the average and standard error of the classification accuracy over 10 randomly drawn train-test splits of the same size for each dataset, constructed as described in Section 4. The last row in Table 2 shows the performance of Multi-GCN. In all three datasets, Multi-GCN outperforms existing state-of-the-art benchmarks, with the margin of improvement over the best baseline greatest in the poverty prediction task (59.23 vs. 55.19) and smallest in the product adoption task (73.47 vs. 71.90).
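The summary statistic reported in each table cell can be computed as in the sketch below, assuming the standard error is the sample standard deviation divided by the square root of the number of splits. The accuracy values are made-up placeholders, not results from the paper.

```python
import math
import statistics

def mean_and_standard_error(accuracies):
    """Mean accuracy and its standard error across repeated train-test splits."""
    mean = statistics.fmean(accuracies)
    std_error = statistics.stdev(accuracies) / math.sqrt(len(accuracies))
    return mean, std_error

# Hypothetical accuracies (in percent) from 10 random splits; illustrative only.
accuracies = [70.0, 72.0, 74.0, 76.0, 78.0, 70.0, 72.0, 74.0, 76.0, 78.0]
mean, std_error = mean_and_standard_error(accuracies)
print(f"{mean:.2f} ± {std_error:.2f}")
```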
The second set of experimental results, comparing Multi-GCN to recent benchmarks on a more standard node classification task, is shown in Table 3. We compare performance both on the predefined train-test splits and on randomly drawn train-test splits, following the original tests by ? (?), with an additional validation set of 500 instances used for hyper-parameter tuning. In all cases, we observe improvements in predictive accuracy of Multi-GCN relative to existing approaches.
6 Discussion
This paper proposes a new approach to semi-supervised learning on multi-view graphs. Through a series of experiments, we show that this approach improves upon state-of-the-art embedding- and convolution-based algorithms on a variety of prediction tasks related to both poverty research and to node labelling in general.
Relative to single-view learning algorithms, the main value of the Multi-GCN approach is that it incorporates non-redundant information from multiple views into the learning process. Thus, the gains from Multi-GCN depend on the prediction task, and the importance of multi-view graph structure to that task. Intuitively, this depends on the mutual information between the views. This intuition is also supported by a closer look at the results in Table 2. Here, we observe that while Multi-GCN provides the biggest gains relative to DeepWalk, Node2vec and LINE in the case of product adoption, the gains relative to single-view GCN are more modest. By contrast, the performance gain on the poverty and gender prediction tasks is substantially higher for Multi-GCN, even relative to the other single-view GCN benchmarks. The spy plots in Figures 2(a)-2(c) help explain this pattern. In particular, we can see that different views in the product adoption setting appear somewhat redundant, whereas for poverty and gender prediction the views appear more independent.
We believe future work should explore several limitations of the current analysis. In particular, there is much to be learned from a more systematic exploration of the value of additional views, and of different methods for merging views (beyond the subspace learning approach developed in Section 3.1). We are also exploring how graphs with varying degrees of sparsity and a different fraction of labeled nodes can impact the performance of Multi-GCN relative to alternative approaches.
7 Conclusion
Graph convolutional networks have recently achieved considerable success in a variety of learning tasks on irregular, graph-structured data. Leveraging insights from spectral graph theory, GCNs are beginning to replicate the success that CNNs have seen on more regular image and text data. For a wide variety of learning tasks relevant to graph-structured data, in contexts ranging from advertising in online networks to intervening in the spread of a contagious disease, this is a promising development.
In this paper, we have shown that state-of-the-art GCNs can achieve even greater performance on a variety of classification tasks when the multi-view nature of the underlying network is incorporated into the learning process. While motivated by three applications in global poverty research, the performance gains appear to generalize to other graph-based classification problems. We therefore view Multi-GCN as an important first step in adapting neural network-based approaches to multi-view networks and hope that it provides a foundation for future work in this space.
8 Acknowledgements
This research was supported by the National Science Foundation under award #CCF-1637360 (Algorithms in the Field) and by the Office of Naval Research (Minerva Initiative) under award N00014-17-1-2313.
References
- Blumenstock, J.; Cadamuro, G.; and On, R. 2015. Predicting poverty and wealth from mobile phone metadata. Science 350(6264):1073–1076.
- Blumenstock, J. E. 2014. Calling for Better Measurement: Estimating an Individual’s Wealth and Well-Being from Mobile Phone Transaction Records. In The 20th ACM Conference on Knowledge Discovery and Mining (KDD ’14), Workshop on Data Science for Social Good.
- Blumenstock, J. E. 2016. Fighting poverty with data. Science 353(6301):753–754.
- Bruna, J.; Zaremba, W.; Szlam, A.; and LeCun, Y. 2013. Spectral networks and locally connected networks on graphs. arXiv preprint arXiv:1312.6203.
- Defferrard, M.; Bresson, X.; and Vandergheynst, P. 2016. Convolutional neural networks on graphs with fast localized spectral filtering. In Advances in Neural Information Processing Systems, 3844–3852.
- Dong, X.; Frossard, P.; Vandergheynst, P.; and Nefedov, N. 2014. Clustering on multi-layer graphs via subspace analysis on Grassmann manifolds. IEEE Transactions on Signal Processing 62(4):905–918.
- Farseev, A.; Samborskii, I.; Filchenkov, A.; and Chua, T.-S. 2017. Cross-domain recommendation via clustering on multi-layer graphs. In Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, 195–204. ACM.
- Fatehkia, M.; Kashyap, R.; and Weber, I. 2018. Using Facebook ad data to track the global digital gender gap. World Development 107:189–209.
