TL;DR
This paper investigates the problem of over-dilution in Graph Neural Networks, introduces a formal framework for it, and proposes a transformer-based solution to improve node representation quality.
Contribution
It introduces the concept of over-dilution in GNNs, formulates it with dilution factors, and presents a transformer-based method to mitigate this issue.
Findings
Even within a single layer, the information specific to an individual node can become significantly diluted.
Transformer-based solutions can alleviate over-dilution effects.
The proposed approach enhances the informativeness of graph representations.
Abstract
Message Passing Neural Networks (MPNNs) hold a key position in machine learning on graphs, but they struggle with unintended behaviors, such as over-smoothing and over-squashing, due to irregular data structures. The observation and formulation of these limitations have become foundational in constructing more informative graph representations. In this paper, we delve into the limitations of MPNNs, focusing on aspects that have previously been overlooked. Our observations reveal that even within a single layer, the information specific to an individual node can become significantly diluted. To delve into this phenomenon in depth, we present the concept of Over-dilution and formulate it with two dilution factors: intra-node dilution for attribute-level and inter-node dilution for node-level representations. We also introduce a transformer-based solution that alleviates over-dilution and…
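As a rough intuition for the two dilution factors named in the abstract, the sketch below computes how much weight a node's own signal keeps after one unweighted mean-aggregation step. It is an illustrative toy under stated assumptions, not the paper's formal definition: the function names, the toy graph, and the choice of plain mean aggregation with a self-loop are all hypothetical.

```python
# Illustrative only: a toy notion of "how much of a node's own signal
# survives" one unweighted mean-aggregation layer. These are NOT the
# paper's formal dilution factors; all names here are hypothetical.

def inter_node_retention(adjacency, node):
    """Weight of `node`'s own representation after taking the mean over
    itself and its neighbors (self-loop included)."""
    degree = len(adjacency[node])
    return 1.0 / (degree + 1)

def intra_node_retention(num_attributes):
    """Weight of a single attribute when a node representation is the
    unweighted mean of its attribute embeddings."""
    return 1.0 / num_attributes

# Hypothetical toy graph: node 0 is a hub with four neighbors.
toy_graph = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}

hub_share = inter_node_retention(toy_graph, 0)   # hub keeps 1/5 of the mix
leaf_share = inter_node_retention(toy_graph, 1)  # leaf keeps 1/2 of the mix
attr_share = intra_node_retention(10)            # one of ten attributes

# Share of one hub attribute in the hub's updated representation.
combined_share = hub_share * attr_share
```

Under these assumptions, a single attribute of a well-connected node contributes only a small fraction of that node's updated representation after one layer, which is the kind of single-layer effect the abstract describes.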
Peer Reviews
Decision·Submitted to ICLR 2024
S1. Interesting new perspective to study the limitation of MPNNs.
S2. Improved performance in tasks like link prediction and node classification.
Please see questions below.
* The paper introduces the concept of over-dilution, a novel perspective in the study of GNNs, particularly MPNNs, that goes beyond the well-studied limitations of over-smoothing and over-squashing.
* The proposed transformer-based architecture is not only theoretically grounded but also empirically tested, providing a strong case for its effectiveness in combating the over-dilution problem. This dual approach enhances the credibility of the findings.
* Experiments are not complete.
* The story of this paper is weird. I don't know why the authors include over-smoothing and over-squashing in the story without making any comparison between them and over-dilution.
1. The motivation to adaptively utilize attributes for each node is sound.
2. The analysis of the dilution factors and the formal definitions have some merits.
3. The improvements in some datasets are impressive.
1. While the authors conduct experiments on both link prediction and node classification, they use only three datasets (i.e., computers, photo, and cora ML) for node classification. OGB datasets for node classification are not included. I would like to see some results on ogbn-arxiv or ogbn-products. Even if the model may not perform well on these datasets, I suggest the authors provide some analysis or insight into what kinds of datasets would benefit more from the proposed model.
2. To
