TL;DR
This paper investigates how link prediction models perform under distribution shifts in graph data, introducing a new splitting strategy to evaluate model robustness and revealing significant performance variability.
Contribution
The paper proposes LPShift, a novel dataset splitting method based on structural properties, to systematically study link predictor generalizability under distribution shifts.
Findings
Model performance varies drastically under LPShift variants.
Graph structure significantly influences generalization success.
Current models are sensitive to distribution changes.
Abstract
State-of-the-art link prediction (LP) models demonstrate impressive benchmark results. However, popular benchmark datasets often assume that training, validation, and testing samples are representative of the overall dataset distribution. In real-world situations, this assumption is often incorrect; uncontrolled factors lead new dataset samples to come from a different distribution than training samples. Additionally, the majority of recent work with graph dataset shift focuses on node- and graph-level tasks, largely ignoring link-level tasks. To bridge this gap, we introduce a novel splitting strategy, known as LPShift, which utilizes structural properties to induce a controlled distribution shift. We verify LPShift's effect through empirical evaluation of SOTA LP models on 16 LPShift variants of original dataset splits, with results indicating drastic changes to model performance.…
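As a rough illustration of what a structure-based split might look like, the sketch below partitions edges by their common-neighbor count, so that training edges and test edges come from different structural regimes. This is not the authors' LPShift procedure, which this summary does not specify: the choice of common neighbors as the structural property, the `threshold` parameter, and all function names are assumptions made for illustration only.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)}."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def common_neighbor_count(adj, u, v):
    """Number of nodes adjacent to both u and v."""
    return len(adj[u] & adj[v])


def structural_split(edges, threshold):
    """Hypothetical split: edges whose common-neighbor count is at most
    `threshold` go to train, the rest go to test. Scores are computed on
    the full graph, which is an assumption of this sketch."""
    adj = build_adjacency(edges)
    train, test = [], []
    for u, v in edges:
        score = common_neighbor_count(adj, u, v)
        if score <= threshold:
            train.append((u, v))
        else:
            test.append((u, v))
    return train, test


if __name__ == "__main__":
    # Toy graph: a triangle (0, 1, 2) with a tail 2-3-4.
    toy_edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]
    train_edges, test_edges = structural_split(toy_edges, threshold=0)
    print("train:", train_edges)
    print("test:", test_edges)
```

On the toy graph, the triangle edges each share one common neighbor and fall into the test set, while the tail edges share none and fall into the training set, so a model trained on this split sees only low-overlap links before being evaluated on high-overlap ones.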
Peer Reviews
Decision·Submitted to ICLR 2025
1. The focus on distribution shifts in link prediction is valuable, as models often fail to generalize well when such shifts occur in real-world settings, like social networks or recommender systems. This paper directly addresses this issue by benchmarking models in varied shift scenarios.
2. The paper provides comprehensive baseline comparisons under the distribution shift setting.
1. The motivation behind distribution shifts in link prediction is not effectively illustrated in the introduction. A clearer example is needed to demonstrate why link prediction models might struggle with distribution shifts: Figure 1 is confusing. Models should easily predict links with more common neighbors (blue and green links) compared to red links, aligning with typical link prediction assumptions. However, this is not explained well, which weakens the introductory motivation.
2. The d
1. This paper tackles a compelling and timely research problem: handling complex distribution shifts in graph data for link prediction tasks, a topic of significant recent interest.
2. The splitting strategy is straightforward and well explained, making it easy to understand.
3. The observations are inspiring and help convey the authors' motivations.
1. The novelty is somewhat concerning, since no comprehensive approach is proposed.
2. The paper lacks detailed theoretical analyses to support its main claims.
3. This paper reads more like a technical report or benchmark paper than a research paper. The authors might address this concern by presenting a formal method rather than intuitive explanations to answer the questions posed.
The paper proposes a method to introduce distribution shift in link prediction. By evaluating various methods across multiple benchmarks, it demonstrates that existing models are vulnerable to distribution shifts in link prediction. Through several analyses, the authors aim to show the effectiveness of the proposed method for inducing distribution shift and to explain why there is a drop in model performance.
1. The most critical issue is the lack of clarity on why distribution shift is necessary in link prediction and what realistic scenarios justify this need. Can’t we consider changes in edge connections as changes at the graph level? How does graph-level distribution shift differ from this? It would be beneficial to provide specific scenarios that demonstrate the importance of this problem. Without this, it might seem like the issue is being created without a clear purpose.
2. It’s unclear wheth
Taxonomy
Methods: Sparse Evolutionary Training
