On the Transferability of Spectral Graph Filters
Ron Levie, Elvin Isufi, Gitta Kutyniok

TL;DR
This paper demonstrates that spectral graph filters can be stable and transferable across different graphs by introducing the Cayley smoothness space, challenging the misconception that spectral filters lack stability.
Contribution
The paper introduces the Cayley smoothness space, proving spectral filters within it are linearly stable and transferable across graphs, advancing understanding of spectral filter generalization.
Findings
Spectral graph filters in the Cayley smoothness space are linearly stable.
Filters in this space can approximate any generic spectral filter.
Spectral filters are transferable due to stability and equivariance.
Abstract
This paper focuses on spectral filters on graphs, namely filters defined as elementwise multiplication in the frequency domain of a graph. In many graph signal processing settings, it is important to transfer a filter from one graph to another. One example is in graph convolutional neural networks (ConvNets), where the dataset consists of signals defined on many different graphs, and the learned filters should generalize to signals on new graphs, not present in the training set. A necessary condition for transferability (the ability to transfer filters) is stability. Namely, given a graph filter, if we add a small perturbation to the graph, then the filter on the perturbed graph is a small perturbation of the original filter. It is a common misconception that spectral filters are not stable, and this paper aims at debunking this mistake. We introduce a space of filters, called the Cayley smoothness space, that contains the filters of state-of-the-art spectral filtering methods, and whose filters can approximate any generic spectral filter. For filters in this space, the perturbation in the filter is bounded by a constant times the perturbation in the graph, and filters in the Cayley smoothness space are thus termed linearly stable. By combining stability with the known property of equivariance, we prove that graph spectral filters are transferable.
1 Introduction
The success of convolutional neural networks (ConvNets) on Euclidean domains ignited an interest in recent years in extending these methods to graph structured data. A graph ConvNet is a mapping that receives a signal defined over the vertices of a graph and returns a value in some output space. A graph ConvNet consists of many layers of computations, where each layer computes a set of filters of the output of the previous layer, followed by a pointwise nonlinearity, and optionally a pooling step and non-convolution layers. In a machine learning setting, the general architecture of the ConvNet is fixed, but the specific filters to use in each layer are free parameters. In training, the filter coefficients are optimized to minimize some loss function. In some situations, both the graph and the signal defined on the graph are variables in the input space of the ConvNet. In these situations, if two graphs represent the same underlying phenomenon, and the two signals given on the two graphs are similar in some sense, the output of the ConvNet on both signals should be similar as well. This property is typically termed transferability, and is an essential requirement if we wish the ConvNet to generalize well on the test set. Analyzing and proving transferability is the subject of this paper.
A necessary condition for any reasonable definition of transferability is stability. Namely, given a filter, if the topology of a graph is perturbed, then the filter on the perturbed graph is close to the filter on the unperturbed graph. Without stability it is not even possible to transfer a filter from a graph to another very close graph, and thus stability is necessary for transferability. Previous work studied the behavior of graph filters with respect to variations in the graph. [1] provided numerical results on the robustness of polynomial graph filters to additive Gaussian perturbations of the eigenvectors of the graph Laplacian. Since the eigendecomposition is not stable to perturbations in the topology of the graph, this result does not prove robustness to such perturbations. [2] showed that the expected graph filter under random edge losses is equal to the accurate output. However, [2] did not bound the error in the output in terms of the error in the graph topology. In this paper we show the linear stability of graph filters to general perturbations in the topology.
There are generally two approaches to defining convolution on graphs, both generalizing the standard convolution on Euclidean domains [3, 4]. Spatial approaches generalize the idea of a sliding window to graphs. Here, the main challenge is to define a way to translate a filter kernel along the vertices of the graph. Some popular examples of spatial methods are [5, 6, 7]. Spectral methods are inspired by the convolution theorem in Euclidean domains, which states that convolution in the spatial domain is equivalent to pointwise multiplication in the frequency domain. The challenge here is to define the frequency domain and the Fourier transform of graphs. The basic idea is to define the graph Laplacian, or some other graph operator that we interpret as a shift operator, and to use its eigenvalues as frequencies and its eigenvectors as the corresponding pure harmonics [8]. Decomposing a graph signal into its pure harmonic coefficients is by definition the graph Fourier transform, and filters are defined by multiplying the different frequency components by different values. For some examples of spectral methods see, e.g., [9, 10, 11, 12]. Additional references for both methods can be found in [4].
The great majority of researchers from the graph ConvNet community currently focus on developing spatial methods. One typical motivation for favoring spatial methods is the claim that spectral methods are not transferable, and thus do not generalize well on graphs unseen in the training set. The goal in this paper is to debunk this misconception, and to show that state-of-the-art spectral graph filtering methods are transferable. This paper does not argue against spatial methods, but shows the potential of spectral approaches to cope with datasets having varying graphs. We would like to encourage researchers to reconsider spectral methods in such situations. Interestingly, [13] obtained near state-of-the-art results in face recognition using spectral filters on variable graphs, without any modification to compensate for the “non-transferability”.
2 Preliminaries
2.1 Spectral graph filters
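Consider an undirected weighted graph $\mathcal{G}=\{\mathcal{V},\mathcal{E},\mathbf{W}\}$, with vertices $\mathcal{V}=\{1,\ldots,N\}$, edges $\mathcal{E}\subset\mathcal{V}\times\mathcal{V}$, and adjacency matrix $\mathbf{W}$. The adjacency matrix is symmetric and represents the weights of the edges, where $w_{n,m}$ is nonzero only if vertex $n$ is connected to vertex $m$. Consider the degree matrix $\mathbf{D}$, defined as the diagonal matrix with entries $d_{n,n}=\sum_{m=1}^{N}w_{n,m}$.
The frequency domain of a graph is determined by choosing a shift operator, namely a self-adjoint operator that respects the connectivity of the graph. As a prototypical example, we consider the unnormalized Laplacian $\boldsymbol{\Delta}_U=\mathbf{D}-\mathbf{W}$, which depends linearly on $\mathbf{W}$. Other examples of common shift operators are the normalized Laplacian $\boldsymbol{\Delta}_N=\mathbf{D}^{-1/2}\boldsymbol{\Delta}_U\mathbf{D}^{-1/2}$, and the adjacency matrix itself. In this paper we call a generic self-adjoint shift operator Laplacian, and denote it by $\boldsymbol{\Delta}$. Denote the eigenvalues of $\boldsymbol{\Delta}$ by $\{\lambda_n\}_{n=1}^{N}$, and the eigenvectors by $\{\phi_n\}_{n=1}^{N}$. The Fourier transform of a graph signal $f:\mathcal{V}\to\mathbb{C}$ is given by the vector of frequency intensities
$$\mathcal{F}f=\big(\langle f,\phi_n\rangle\big)_{n=1}^{N},$$
where $\langle u,v\rangle$ is an inner product in $\mathbb{C}^{N}$, e.g., the standard dot product. The inverse Fourier transform of the vector $(v_n)_{n=1}^{N}$ is given by
$$\mathcal{F}^{*}(v_n)_{n=1}^{N}=\sum_{n=1}^{N}v_n\phi_n.$$
Since $\{\phi_n\}_{n=1}^{N}$ is an orthonormal basis, $\mathcal{F}^{*}$ is the inverse of $\mathcal{F}$. A spectral graph filter $\mathbf{G}$ based on the coefficients $(g_n)_{n=1}^{N}$ is defined by
$$\mathbf{G}f=\sum_{n=1}^{N}g_n\langle f,\phi_n\rangle\phi_n. \tag{1}$$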
Any spectral filter defined by (1) is equivariant, namely, it does not depend on the indexing of the vertices: re-indexing the vertices in the input results in the same re-indexing of the vertices in the output.
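As a quick numerical illustration of this equivariance, the following NumPy sketch filters a signal on a small random graph before and after re-indexing the vertices. The random graph, the signal, and the response `g` (with the coefficients of (1) taken as the values of `g` at the eigenvalues) are hypothetical choices made for illustration only and are not taken from the paper.

```python
import numpy as np

def laplacian(W):
    """Unnormalized Laplacian D - W of a symmetric adjacency matrix W."""
    return np.diag(W.sum(axis=1)) - W

def spectral_filter(L, g, f):
    """Multiply each Fourier coefficient of f by g(lambda_n), then transform back."""
    lam, phi = np.linalg.eigh(L)       # eigenvalues and orthonormal eigenvectors (columns)
    f_hat = phi.T @ f                  # graph Fourier transform of f
    return phi @ (g(lam) * f_hat)      # inverse transform of the filtered coefficients

rng = np.random.default_rng(0)
N = 6
A = rng.random((N, N))
W = np.triu(A, 1) + np.triu(A, 1).T    # symmetric weights, zero diagonal
f = rng.standard_normal(N)
g = lambda lam: np.exp(-lam)           # hypothetical low-pass response

perm = rng.permutation(N)
P = np.eye(N)[perm]                    # permutation matrix: (P @ f)[i] == f[perm[i]]

out = spectral_filter(laplacian(W), g, f)
out_reindexed = spectral_filter(laplacian(P @ W @ P.T), g, P @ f)

# Filtering the re-indexed input should equal re-indexing the filtered output.
equivariance_error = np.max(np.abs(out_reindexed - P @ out))
print(equivariance_error)
```

The printed error should be at the level of floating-point round-off, since the re-indexed Laplacian has the same eigenvalues and permuted eigenvectors.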
Spectral filters defined by (1) have two disadvantages. First, as shown in Subsection 2.2, they are not transferable. Second, they entail high computational complexity. Formula (1) requires the computation of the eigendecomposition of the Laplacian $\boldsymbol{\Delta}$, which is computationally demanding and can be unstable when the number of vertices is large. Moreover, there is no general “graph FFT” algorithm for computing the Fourier transform of a signal $f$, and (1) requires computing the frequency components and their summation directly.
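To overcome these two limitations, state-of-the-art methods, like [10, 14, 11, 12], are based on functional calculus. Functional calculus is the theory of applying functions on normal operators in Hilbert spaces. In the special case of a self-adjoint or unitary operator $\mathbf{T}$ with a discrete spectrum, $g(\mathbf{T})$ is defined by
$$g(\mathbf{T})v=\sum_{n}g(\lambda_n)\langle v,\phi_n\rangle\phi_n \tag{2}$$
for any vector $v$ in the Hilbert space, where $\{\lambda_n,\phi_n\}_{n}$ is the eigendecomposition of the operator $\mathbf{T}$. The operator $g(\mathbf{T})$ is normal for general complex-valued $g$, self-adjoint for real-valued $g$, and unitary for $g$ with values in $e^{i\mathbb{R}}$ (where $e^{i\mathbb{R}}$ is the unit complex circle).
Definition (2) is canonical in the following sense. In the special case where
$$g(\lambda)=\frac{\sum_{k=0}^{K}\alpha_k\lambda^{k}}{\sum_{m=0}^{M}\beta_m\lambda^{m}}$$
is a rational function, $g(\mathbf{T})$ can be defined in two ways. First, by (2), and second by compositions, linear combinations, and inversions, as
$$g(\mathbf{T})=\Big(\sum_{k=0}^{K}\alpha_k\mathbf{T}^{k}\Big)\Big(\sum_{m=0}^{M}\beta_m\mathbf{T}^{m}\Big)^{-1}. \tag{3}$$
It can be shown that (2) and (3) are equivalent. Moreover, definition (2) is also canonical in regard to non-rational functions. Loosely speaking, if a rational function $r$ approximates the function $g$, then the operator $r(\mathbf{T})$ approximates the operator $g(\mathbf{T})$.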
Implementation (3) overcomes the limitations of definition (1), where now filters are defined via (2) with a polynomial or rational function $g$. By relying on the spatial operations of compositions, linear combinations, and inversions, the computation of a spectral filter is carried out entirely in the spatial domain, without ever resorting to spectral computations. Thus, no eigendecomposition and Fourier transforms are ever computed. The inversions in (3) involve solving systems of linear equations, which can be computed directly if the number of vertices $N$ is small, or by some iterative approximation method for large $N$. Methods like [10, 15, 8, 12] use polynomial filters, and [14, 11, 16] use rational function filters. We term spectral methods based on functional calculus functional calculus filters.
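The equivalence between the spectral definition and the spatial implementation can be checked numerically. The sketch below is a minimal illustration under stated assumptions: the random graph, the polynomial coefficients, and the rational response $1/(1+\lambda)$ are hypothetical examples, and the function names are introduced here for illustration. The eigendecomposition is used only as a reference to compare against.

```python
import numpy as np

def filter_via_eigendecomposition(L, g, f):
    """Reference: scale each Fourier coefficient of f by g(lambda_n)."""
    lam, phi = np.linalg.eigh(L)
    return phi @ (g(lam) * (phi.T @ f))

def polynomial_filter(L, coeffs, f):
    """Spatial evaluation of sum_k coeffs[k] * L^k f using only mat-vec products."""
    out = np.zeros_like(f)
    power = f.copy()                      # holds L^k f at step k
    for c in coeffs:
        out = out + c * power
        power = L @ power
    return out

def rational_filter(L, num, den, f):
    """Spatial evaluation of (sum_k num[k] L^k)(sum_m den[m] L^m)^{-1} f."""
    den_matrix = sum(b * np.linalg.matrix_power(L, m) for m, b in enumerate(den))
    y = np.linalg.solve(den_matrix, f)    # the inversion step, done as a linear solve
    return polynomial_filter(L, num, y)

rng = np.random.default_rng(1)
N = 8
A = rng.random((N, N))
W = np.triu(A, 1) + np.triu(A, 1).T       # symmetric weights, zero diagonal
L = np.diag(W.sum(axis=1)) - W            # unnormalized Laplacian
f = rng.standard_normal(N)

poly = [0.5, -0.3, 0.1]                   # g(lambda) = 0.5 - 0.3*lambda + 0.1*lambda^2
poly_spatial = polynomial_filter(L, poly, f)
poly_spectral = filter_via_eigendecomposition(
    L, lambda lam: np.polynomial.polynomial.polyval(lam, poly), f)

rat_spatial = rational_filter(L, [1.0], [1.0, 1.0], f)   # g(lambda) = 1 / (1 + lambda)
rat_spectral = filter_via_eigendecomposition(L, lambda lam: 1.0 / (1.0 + lam), f)

print(np.max(np.abs(poly_spatial - poly_spectral)))
print(np.max(np.abs(rat_spatial - rat_spectral)))
```

For large graphs the dense `np.linalg.solve` call would typically be replaced by an iterative solver, in line with the remark above about iterative approximation methods.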
2.2 The misconception of non-transferability of spectral graph filters
The non-transferability claim is formulated based on the sensitivity of the Laplacian eigendecomposition to small perturbations in $\mathbf{W}$, or equivalently in $\boldsymbol{\Delta}$. Namely, a small perturbation of $\boldsymbol{\Delta}$ can result in a large perturbation of the eigendecomposition $\{\lambda_n,\phi_n\}_{n=1}^{N}$, which results in a large change in the filter defined via (1). This argument, while true, does not prove non-transferability, since state-of-the-art spectral methods do not explicitly use the eigenvectors, and do not parametrize the filter coefficients via the index $n$ of the eigenvalues. Instead, state-of-the-art methods are based on functional calculus, and define the filter coefficients using a function $g$, as $g_n=g(\lambda_n)$. The parametrization of the filter coefficients by $g$ is indifferent to the specifics of how the spectrum is indexed, and instead represents an overall response in the frequency domain, where the value of each frequency determines its response, and not its index. In functional calculus filters defined by (2), a small perturbation of $\boldsymbol{\Delta}$ that results in a perturbation of $\lambda_n$ also results in a perturbation of the coefficients $g(\lambda_n)$. It turns out, as we prove in this paper, that the perturbation in $g(\lambda_n)$ implicitly compensates for the instability of the eigendecomposition, and functional calculus spectral filters are stable.
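The contrast between the two parametrizations can be seen on a toy example. The following sketch uses a hypothetical 2-by-2 self-adjoint "Laplacian" (in the generic sense used in this paper) with a nearly degenerate spectrum, so that a tiny perturbation rotates the eigenvectors by about 45 degrees. The matrices, the index-based coefficients, and the response `g` are assumptions chosen only to make the effect visible.

```python
import numpy as np

def index_filter(L, coeffs):
    """Filter (1): coefficients attached to eigenvalue indices (ascending order)."""
    _, phi = np.linalg.eigh(L)
    return phi @ np.diag(coeffs) @ phi.T

def functional_filter(L, g):
    """Filter (2): coefficients g(lambda_n) attached to eigenvalue values."""
    lam, phi = np.linalg.eigh(L)
    return phi @ np.diag(g(lam)) @ phi.T

delta, eps = 1e-8, 1e-4
L = np.array([[1.0, 0.0], [0.0, 1.0 + delta]])          # nearly repeated eigenvalue
L_pert = L + eps * np.array([[0.0, 1.0], [1.0, 0.0]])   # perturbation of operator norm eps

coeffs = np.array([1.0, 0.0])      # keep only the lowest-index frequency
g = lambda lam: np.exp(-lam)       # smooth response evaluated at the eigenvalues

index_change = np.linalg.norm(
    index_filter(L, coeffs) - index_filter(L_pert, coeffs), 2)
functional_change = np.linalg.norm(
    functional_filter(L, g) - functional_filter(L_pert, g), 2)

print(index_change)        # order one, although the perturbation has size eps
print(functional_change)   # comparable to eps
```

The index-based filter changes by an amount of order one because its eigenvectors rotate, while the functional calculus filter changes only by an amount comparable to the perturbation itself.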
3 Main results
3.1 Transferability of functional calculus filters
In this paper, we define transferability as the linear robustness of the filter to re-indexing of the vertices and perturbation of the topology of the graph. Thus, to formulate transferability, we combine equivariance with stability. Since spectral filters are known to be equivariant, transferability is equivalent to stability. Thus, our goal is to prove stability.
3.2 Linear stability of spectral filters
Stability is proven on a dense subspace of filters, which we term the Cayley smoothness space. The definition of the Cayley smoothness space is based on the Cayley transform $\mathcal{C}:\mathbb{R}\to e^{i\mathbb{R}}$, defined by $\mathcal{C}(x)=\frac{x-i}{x+i}$.
Definition 1.
The Cayley smoothness space is the subspace of functions of the form $g(\lambda)=q\big(\mathcal{C}(\lambda)\big)$, where $q:e^{i\mathbb{R}}\to\mathbb{C}$ is in $L^{2}(e^{i\mathbb{R}})$, and has classical Fourier coefficients $\{c_l\}_{l\in\mathbb{Z}}$ satisfying $\|g\|_{\mathcal{C}}:=\sum_{l\in\mathbb{Z}}|l\,c_l|<\infty$.
The mapping $g\mapsto\|g\|_{\mathcal{C}}$ is a seminorm. It is not difficult to show that the Cayley smoothness space is dense in each $L^{p}(\mathbb{R})$ space with $1\leq p<\infty$. Intuitively, Cayley smoothness implies decay of the filter kernel in the spatial domain, since it models smoothness in a frequency domain. This can be formulated rigorously for graph filters based on Cayley polynomials ($g(\lambda)=q\big(\mathcal{C}(\lambda)\big)$ with finite expansion $\{c_l\}$) [11, Theorem 4].
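To make the definition concrete, the sketch below evaluates a filter with a finite Cayley expansion on a few real frequencies. It rests on two assumptions that are not spelled out in the text as extracted here: that the Cayley transform has the standard form $(\lambda-i)/(\lambda+i)$, and that the seminorm is the sum of $|l\,c_l|$ over the expansion coefficients. The coefficient choice is a hypothetical example, picked because the resulting response has the closed form $1/(1+\lambda^2)$.

```python
def cayley(lam):
    """Cayley transform of a real number (assumed form: (lam - i) / (lam + i))."""
    return (lam - 1j) / (lam + 1j)

def cayley_filter_response(lam, coeffs):
    """Evaluate g(lam) = sum_l coeffs[l] * C(lam)**l for a finite expansion."""
    z = cayley(lam)
    return sum(c * z ** l for l, c in coeffs.items())

def cayley_seminorm(coeffs):
    """Assumed seminorm: sum_l |l * c_l|; the constant term (l = 0) contributes nothing."""
    return sum(abs(l * c) for l, c in coeffs.items())

# Hypothetical expansion; with these coefficients g(lam) = 1 / (1 + lam**2).
coeffs = {-1: -0.25, 0: 0.5, 1: -0.25}
samples = [-10.0, -1.0, 0.0, 0.5, 3.0]

moduli = [abs(cayley(x)) for x in samples]                 # each equals 1 up to round-off
responses = [cayley_filter_response(x, coeffs) for x in samples]
seminorm = cayley_seminorm(coeffs)

for x, r in zip(samples, responses):
    print(x, r.real, 1.0 / (1.0 + x * x))
print(seminorm)
```

The moduli confirm that real frequencies are mapped to the unit circle, which is what allows the response to be treated as a Fourier series in the transformed variable.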
For filters in the Cayley smoothness space we can obtain a linear rate of convergence, which is our main contribution.
Theorem 2.
Let be a self-adjoint matrix that we call Laplacian. Let be self-adjoint, such that . Let . Then
[TABLE]
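Linear stability can be probed numerically by shrinking a perturbation and watching the filter error shrink at the same rate. The sketch below is an illustration under stated assumptions, not a reproduction of the bound in Theorem 2: the Cayley transform is taken as $(\boldsymbol{\Delta}-i\mathbf{I})(\boldsymbol{\Delta}+i\mathbf{I})^{-1}$, the graph, perturbation direction, and coefficients are hypothetical, and the comparison constant $2\sum_l|l\,c_l|$ is an elementary estimate that follows from the resolvent identity and is not necessarily the constant of the theorem.

```python
import numpy as np

def cayley(L):
    """Cayley transform (L - iI)(L + iI)^{-1} of a self-adjoint matrix; the result is unitary."""
    I = np.eye(L.shape[0])
    return (L - 1j * I) @ np.linalg.inv(L + 1j * I)

def cayley_filter(L, coeffs):
    """g(L) = sum_l coeffs[l] * C(L)^l, with integer powers l (negative allowed)."""
    C = cayley(L)
    return sum(c * np.linalg.matrix_power(C, l) for l, c in coeffs.items())

rng = np.random.default_rng(2)
N = 10
A = rng.random((N, N))
W = np.triu(A, 1) + np.triu(A, 1).T
L = np.diag(W.sum(axis=1)) - W               # unnormalized Laplacian

S = rng.standard_normal((N, N))
E = (S + S.T) / 2                            # self-adjoint perturbation direction
E /= np.linalg.norm(E, 2)                    # normalize to unit operator norm

coeffs = {-1: -0.25, 0: 0.5, 1: -0.25}       # hypothetical; gives g(lambda) = 1/(1+lambda^2)
seminorm = sum(abs(l * c) for l, c in coeffs.items())

sizes = [1e-1, 1e-2, 1e-3, 1e-4]             # operator norms of the perturbation t * E
errors = [np.linalg.norm(cayley_filter(L, coeffs) - cayley_filter(L + t * E, coeffs), 2)
          for t in sizes]
ratios = [err / t for err, t in zip(errors, sizes)]

for t, err, ratio in zip(sizes, errors, ratios):
    print(f"perturbation {t:.0e}  filter error {err:.3e}  ratio {ratio:.3f}")
```

If the filter is linearly stable, the printed ratios stay bounded as the perturbation shrinks, rather than growing.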
4 Examples
ChebNets. Consider the normalized translated Laplacian $\boldsymbol{\Delta}=\boldsymbol{\Delta}_N-\mathbf{I}$. In ChebNets [10], $g$ is a polynomial, and since the spectrum of $\boldsymbol{\Delta}$ is in $[-1,1]$, the values of $g$ outside $[-1,1]$ do not affect the filter $g(\boldsymbol{\Delta})$. Thus, we may assume that $g$ is a polynomial in $[-1,1]$, and padded outside $[-1,1]$ to obtain a smooth compactly supported function. It is easy to see that such a $g$ is in the Cayley smoothness space. Thus, for two translated normalized Laplacians $\boldsymbol{\Delta}$ and $\boldsymbol{\Delta}'$ of two graphs, $\|g(\boldsymbol{\Delta})-g(\boldsymbol{\Delta}')\|=O\big(\|\boldsymbol{\Delta}-\boldsymbol{\Delta}'\|\big)$.
General rational functions. The above claim is also true for general rational functions, if we assume that the spectrum of $\boldsymbol{\Delta}$ is contained in some pre-defined band. Thus, the polynomial filters of [15, 12] and the ARMA rational function filters of [14, 16] are also transferable, under the assumption of uniformly bounded Laplacians.
CayleyNets. CayleyNets [11] are always transferable, since a Cayley filter is by definition in the Cayley smoothness space, with finite expansion.
To corroborate the proposed theoretical result, in Figure 1 we test the above three examples on the Molene weather dataset (access to the raw data is possible from https://donneespubliques.meteofrance.fr/donneeslibres/Hackathon/RADOMEH.tar.gz). The graph comprises weather stations, with weights given as the Gaussian of the physical distances between stations. Each of the graph signals is a temperature recording. For the polynomial filter we consider the normalized Laplacian, while for the Cayley and ARMA filters we consider the unnormalized Laplacian. The results are averaged over different perturbations in the topology and the graph signals. The experimental results agree with the theoretical linear stability property.
5 Proof of Theorem 2
We start with a useful lemma.
Lemma 3.
Suppose are self-adjoint matrices satisfying , and for some . Then for every
[TABLE]
Proof.
Let .
[TABLE]
so
[TABLE]
Now, (5) follows by repeatedly using (7) with decreasing powers , . ∎
Next, we cite a general property from spectral theory.
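Lemma 4.
Let $\mathbf{T}$ be a bounded normal operator in a Hilbert space. Let $\sigma$ be the spectrum of $\mathbf{T}$. Define the infinity norm on the space of bounded continuous functions $g:\sigma\to\mathbb{C}$ by
$$\|g\|_{\sigma,\infty}=\sup_{\lambda\in\sigma}|g(\lambda)|.$$
Then
$$\|g(\mathbf{T})\|=\|g\|_{\sigma,\infty},$$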
where the norm in the left-hand-side is the operator norm.
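The content of this lemma can be sanity-checked in finite dimensions. The sketch below assumes the lemma is the standard spectral-theory identity that the operator norm of $g(\mathbf{T})$ equals the supremum of $|g|$ over the spectrum of $\mathbf{T}$. The random self-adjoint matrix and the polynomial `g` are hypothetical choices; $g(\mathbf{T})$ is formed by compositions and linear combinations, so the comparison with the eigenvalues is not circular.

```python
import numpy as np

rng = np.random.default_rng(3)
S = rng.standard_normal((7, 7))
T = (S + S.T) / 2                        # self-adjoint, hence normal

# g(lambda) = lambda^2 - 2*lambda, applied to T without any eigendecomposition
g = lambda lam: lam ** 2 - 2.0 * lam
g_of_T = T @ T - 2.0 * T

operator_norm = np.linalg.norm(g_of_T, 2)                    # largest singular value
sup_on_spectrum = np.max(np.abs(g(np.linalg.eigvalsh(T))))   # sup of |g| over the spectrum

print(operator_norm, sup_on_spectrum)
```

The two printed values should agree up to round-off, which is the property used in the density argument to control the truncation terms.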
To prove Theorem 2, we start with a version of the theorem restricted to filters with a finite Cayley expansion.
Proof of Theorem 2 for finite Cayley expansions.
Note that
[TABLE]
so
[TABLE]
[TABLE]
By the fact that the spectrum of is real, , and we have
[TABLE]
Let us bound in terms of . Since we may expand
[TABLE]
so, by ,
[TABLE]
[TABLE]
Observe that $\mathcal{C}(\boldsymbol{\Delta})$ and $\mathcal{C}(\boldsymbol{\Delta}')$ are unitary, so their spectrum is bounded by $1$. Thus, by Lemma 3 and the triangle inequality on the polynomial expansion of $p\big(\mathcal{C}(\boldsymbol{\Delta})\big)-p\big(\mathcal{C}(\boldsymbol{\Delta}')\big)$,
[TABLE]
which gives (4). ∎
Proof of Theorem 2.
Theorem 2 follows from the above result by a simple density argument. Given , we consider the truncations , where is restricted to the coefficients . We base a three-epsilon argument on the expansion, for any ,
[TABLE]
For any , by Lemma 4, the first and the last terms of the right-hand side of (11) can be made smaller than by choosing large enough. Moreover, for any ,
[TABLE]
so, by Theorem 2 for finite Cayley expansions, for every
[TABLE]
- [1] S. Segarra, A. G. Marques, and A. Ribeiro, “Optimal graph-filter design and applications to distributed linear network operators,” IEEE Transactions on Signal Processing, vol. 65, no. 15, pp. 4117–4131, 2017.
- [2] E. Isufi, A. Loukas, A. Simonetto, and G. Leus, “Filtering random graph processes over random time-varying graphs,” IEEE Transactions on Signal Processing, vol. 65, no. EPFL-ARTICLE-230521, pp. 4406–4421, 2017.
- [3] M. M. Bronstein, J. Bruna, Y. LeCun, A. Szlam, and P. Vandergheynst, “Geometric deep learning: Going beyond euclidean data,” IEEE Signal Processing Magazine, vol. 34, no. 4, pp. 18–42, July 2017.
- [4] Z. Wu, S. Pan, F. Chen, G. Long, C. Zhang, and P. S. Yu, “A comprehensive survey on graph neural networks,” arXiv preprint arXiv:1901.00596, 2019.
- [5] M. Gori, G. Monfardini, and F. Scarselli, “A new model for learning in graph domains,” in Proceedings. 2005 IEEE International Joint Conference on Neural Networks, 2005, vol. 2, July 2005, pp. 729–734.
- [6] F. Scarselli, M. Gori, A. C. Tsoi, M. Hagenbuchner, and G. Monfardini, “The graph neural network model,” IEEE Transactions on Neural Networks, vol. 20, no. 1, pp. 61–80, Jan 2009.
- [7] F. Monti, D. Boscaini, J. Masci, E. Rodolà, J. Svoboda, and M. M. Bronstein, “Geometric deep learning on graphs and manifolds using mixture model cnns,” 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5425–5434, 2017.
- [8] A. Ortega, P. Frossard, J. Kovačević, J. M. Moura, and P. Vandergheynst, “Graph signal processing: Overview, challenges, and applications,” Proceedings of the IEEE, vol. 106, no. 5, pp. 808–828, 2018.
