Flaky Performances when Pretraining on Relational Databases
Shengchao Liu, David Vazquez, Jian Tang, Pierre-André Noël

TL;DR
This paper investigates the challenges of applying self-supervised learning to graph neural networks trained on relational databases, revealing issues with negative transfer and proposing a new contrastive loss, InfoNode, to improve performance.
Contribution
The paper identifies negative transfer in contrastive SSL for GNNs trained on RDBs and introduces InfoNode, a contrastive loss that aims to maximize the mutual information between each node's initial-layer and final-layer representations.
Findings
Naive contrastive SSL can cause negative transfer.
The reported empirical results support the effectiveness of InfoNode on relational databases.
The results are consistent with the conjecture that contrastive SSL conflicts with the GNN's message passing layers.
Abstract
We explore the downstream task performances for graph neural network (GNN) self-supervised learning (SSL) methods trained on subgraphs extracted from relational databases (RDBs). Intuitively, this joint use of SSL and GNNs should make it possible to leverage more of the available data, which could translate to better results. However, we found that naively porting contrastive SSL techniques can cause "negative transfer": linear evaluation on fixed representations from a pretrained model performs worse than on representations from the randomly-initialized model. Based on the conjecture that contrastive SSL conflicts with the message passing layers of the GNN, we propose InfoNode: a contrastive loss aiming to maximize the mutual information between a node's initial- and final-layer representation. The primary empirical results support our conjecture and the effectiveness of InfoNode.
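The abstract does not spell out the InfoNode objective, so the following is only a minimal sketch of the general idea, assuming an InfoNCE-style estimator (a common way to lower-bound mutual information, not necessarily the one used in the paper). The function name `info_nce`, the `temperature` parameter, and the assumption that initial- and final-layer vectors have already been projected to the same dimension are all hypothetical choices made for illustration.

```python
import math


def info_nce(initial, final, temperature=0.5):
    """InfoNCE-style loss over a batch of nodes (illustrative assumption).

    initial[i] and final[i] are the initial-layer and final-layer vectors of
    node i, assumed to share one dimension. The positive pair is
    (initial[i], final[i]); final vectors of the other nodes are negatives.
    """
    def normalize(v):
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        return [x / norm for x in v]

    a = [normalize(v) for v in initial]
    b = [normalize(v) for v in final]
    loss = 0.0
    for i, ai in enumerate(a):
        # Cosine similarity of node i's initial vector with every final vector.
        logits = [sum(x * y for x, y in zip(ai, bj)) / temperature for bj in b]
        # Numerically stable log-sum-exp, then the negative log-softmax
        # evaluated at the positive index i.
        m = max(logits)
        log_z = m + math.log(sum(math.exp(l - m) for l in logits))
        loss += log_z - logits[i]
    return loss / len(a)


# Toy check: matching initial/final vectors give a lower loss than swapped ones.
aligned = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

Minimizing a loss of this shape pulls each node's final-layer representation toward its own initial-layer representation and away from those of other nodes in the batch, which is one way to read "maximize the mutual information between a node's initial- and final-layer representation".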
Taxonomy
Topics: Advanced Graph Neural Networks · Privacy-Preserving Technologies in Data · Recommender Systems and Techniques
Methods: Graph Neural Network
