Graph as Point Set
Xiyuan Wang, Pan Li, Muhan Zhang

TL;DR
This paper proposes a novel graph-to-set conversion method that enables the use of set encoders, including Transformers, for graph representation learning, leading to improved expressivity and performance over traditional GNNs.
Contribution
The paper introduces a bijective graph-to-set transformation and new set encoders, including the Point Set Transformer (PST), which improve graph learning capability and theoretical expressivity.
Findings
PST outperforms existing GNNs on substructure and shortest path tasks.
The set encoder achieves comparable performance to GNNs.
The method enables lossless graph information injection into Transformers.
Abstract
Graph is a fundamental data structure to model interconnections between entities. Set, on the contrary, stores independent elements. To learn graph representations, current Graph Neural Networks (GNNs) primarily use message passing to encode the interconnections. In contrast, this paper introduces a novel graph-to-set conversion method that bijectively transforms interconnected nodes into a set of independent points and then uses a set encoder to learn the graph representation. This conversion method holds dual significance. Firstly, it enables using set encoders to learn from graphs, thereby significantly expanding the design space of GNNs. Secondly, for Transformer, a specific set encoder, we provide a novel and principled approach to inject graph information losslessly, different from all the heuristic structural/positional encoding methods adopted in previous graph transformers. To…
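The conversion described in the abstract can be illustrated with a minimal sketch. The reviews below name symmetric rank decomposition (SRD) as the mechanism: a symmetric matrix M is factored as M = Q Qᵀ, and the rows of Q become the points. The sketch assumes M is the degree-plus-adjacency matrix D + A (which is positive semidefinite) and computes Q by eigendecomposition; the function name `symmetric_rank_decomposition`, the tolerance, and the example graph are illustrative choices, not the authors' code.

```python
import numpy as np

def symmetric_rank_decomposition(M, tol=1e-8):
    """Return Q (n x r) with M ≈ Q @ Q.T, for symmetric positive semidefinite M."""
    eigvals, eigvecs = np.linalg.eigh(M)
    keep = eigvals > tol                      # drop numerically zero eigenvalues
    return eigvecs[:, keep] * np.sqrt(eigvals[keep])

# Example graph: the 4-cycle 0-1-2-3-0.
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
D = np.diag(A.sum(axis=1))
M = D + A                                     # assumed input matrix (PSD)

Q = symmetric_rank_decomposition(M)           # one point (row) per node
lossless = np.allclose(Q @ Q.T, M)            # the point set reproduces M

# Relabel the nodes and decompose again.
perm = [2, 0, 3, 1]
P = np.eye(4)[perm]
Q_perm = symmetric_rank_decomposition(P @ M @ P.T)

# The two point sets should agree up to one orthogonal transform R,
# recovered here by solving the orthogonal Procrustes problem.
U, _, Vh = np.linalg.svd((P @ Q).T @ Q_perm)
R = U @ Vh
same_up_to_rotation = np.allclose(P @ Q @ R, Q_perm)
```

Because Q is determined only up to an orthogonal transform, a set encoder applied to these points has to be invariant to such transforms, which is the invariance issue the reviewers raise.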
Peer Reviews
Decision: ICML 2024 Poster
- A clear presentation of the methodology; easy to follow
- The design of the architecture is well motivated by the analysis of symmetric rank decomposition, and this paper further provides theoretical analysis of expressiveness
- The proposed model shows good empirical performance
- This paper exaggerates its contributions: this paper claims that the proposed model is a "paradigm shift", but the use of symmetric rank decomposition is actually not much different from existing positional encodings based on eigendecomposition. Also, the invariance to orthogonal transformations of positional encoding is discussed and addressed by several related papers (e.g., [1]). I don't think the proposed model is significantly different.
- Given the above point, the experiment part should
1. The authors provide theoretical foundations for their approach, demonstrating that isomorphic graphs can be perfectly represented using their method. The use of Symmetric Rank Decomposition is well-explained and supported by theorems. They also provide expressivity results regarding the short-range and the long-range abilities of the models.
2. The proposed method seems to outperform recent graph transformer architectures in the datasets used in this study.
1. One notable weakness in the paper is the tendency to overstate the novelty of certain ideas without providing adequate justification. Specifically, the paper emphasizes the concept of transforming a graph into a set of independent nodes as a novel approach. While the paper introduces a unique method of achieving this transformation through Symmetric Rank Decomposition (SRD), it does not sufficiently acknowledge that the idea of treating graphs as sets of nodes is not entirely novel in the con
- [S1] The perspective of viewing graphs as point sets and performing graph representation learning via a set encoder is very interesting and original, and has great potential across the graph learning community.
- [S2] The idea is fairly simple and easy-to-follow, yet empirically effective as shown in the presented experimental results.
- [W1] There are a few details missing in the methodology/experiments that may help to clarify towards better reproducibility.
  - For each experiment, how is the rank $r$ chosen? Is it chosen via hyperparameter tuning? For larger graphs, it seems choosing a small $r$ will result in loss of information on the connectivity of the input graph. How is it that PST still performs well on Long Range Graph Benchmark despite this potential loss of information?
  - For the experiments, which matrix was u
Taxonomy
Topics: Constraint Satisfaction and Optimization · Computational Geometry and Mesh Generation
Methods: Attention Is All You Need · Sparse Evolutionary Training · Dense Connections · Dropout · Label Smoothing · Residual Connection · Set Transformer · Softmax · Position-Wise Feed-Forward Layer · Byte Pair Encoding
