DyGMamba: Efficiently Modeling Long-Term Temporal Dependency on Continuous-Time Dynamic Graphs with State Space Models

Zifeng Ding; Yifeng Li; Yuan He; Antonio Norelli; Jingcheng Wu; Volker Tresp; Michael Bronstein; Yunpu Ma

arXiv:2408.04713·cs.LG·June 9, 2025

3 Reviews

TL;DR

DyGMamba is a novel model that efficiently captures long-term temporal dependencies in continuous-time dynamic graphs using state space models, achieving state-of-the-art results while maintaining computational efficiency.

Contribution

It introduces DyGMamba, a new approach combining node-level and time-level state space models for effective and efficient long-term dynamic graph representation learning.

Findings

1. Achieves state-of-the-art dynamic link prediction performance.
2. Maintains high efficiency with limited computational resources.
3. Effectively captures long-term temporal dependencies.

Abstract

Learning useful representations for continuous-time dynamic graphs (CTDGs) is challenging, due to the concurrent need to span long node interaction histories and grasp nuanced temporal details. In particular, two problems emerge: (1) Encoding longer histories requires more computational resources, making it crucial for CTDG models to maintain low computational complexity to ensure efficiency; (2) Meanwhile, more powerful models are needed to identify and select the most critical temporal information within the extended context provided by longer histories. To address these problems, we propose a CTDG representation learning model named DyGMamba, originating from the popular Mamba state space model (SSM). DyGMamba first leverages a node-level SSM to encode the sequence of historical node interactions. Another time-level SSM is then employed to exploit the temporal patterns hidden in the…
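The abstract's efficiency argument rests on the fact that a state space model processes a sequence with a single recurrent pass, so cost grows linearly with history length. The following is a minimal sketch of that recurrence, not the authors' implementation: it uses a plain diagonal, non-selective SSM with zero-order-hold discretization, whereas Mamba additionally makes the step size and projections input-dependent. The function name `ssm_scan` and all parameter names are hypothetical.

```python
import math


def ssm_scan(xs, a, b, c, dt):
    """Run a diagonal linear state space model over a scalar sequence.

    xs:   list of floats, the input sequence (for illustration, one feature
          channel of a node's interaction history)
    a:    list of negative floats, diagonal of the continuous-time state matrix
    b, c: lists of floats, input and output projections (same length as a)
    dt:   positive float, discretization step size (fixed here; Mamba
          computes it from the input)
    """
    # Zero-order-hold discretization for a diagonal state matrix:
    #   a_bar = exp(dt * a),  b_bar = (a_bar - 1) / a * b
    a_bar = [math.exp(dt * ai) for ai in a]
    b_bar = [(abi - 1.0) / ai * bi for abi, ai, bi in zip(a_bar, a, b)]

    h = [0.0] * len(a)  # hidden state, one entry per state dimension
    ys = []
    for x in xs:
        # h_t = a_bar * h_{t-1} + b_bar * x_t  (elementwise, since A is diagonal)
        h = [abi * hi + bbi * x for abi, hi, bbi in zip(a_bar, h, b_bar)]
        # y_t = c . h_t
        ys.append(sum(ci * hi for ci, hi in zip(c, h)))
    return ys


# An impulse at the first step decays geometrically through the state.
outputs = ssm_scan([1.0, 0.0, 0.0], a=[-1.0], b=[1.0], c=[1.0], dt=1.0)
```

Each step touches every state dimension once, so a sequence of length L with N state dimensions costs O(L·N) operations and the state kept between steps does not grow with L.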

Peer Reviews

Decision·Submitted to ICLR 2025

Reviewer 01 · Rating 3 · Confidence 4

Strengths

1. The research problem is interesting, and link prediction is a pragmatic and important application.
2. The study has the potential to address dynamic representation learning and its applications from the Mamba perspective.
3. The paper's organization is not hard to follow.

Weaknesses

1. [Minor] The preliminary section is not informative. Ideally, the preliminaries should pave the way for the proposed method. As written, the explanation of Mamba is unclear and is not well aligned with the proposed method. Also, the dimensions in Eq. (1a) seem inconsistent for the matrix computation.
2. For Eq. (4a), the computation procedure for $\mathrm{Broadcast}_{4d}$ is missing. Also, in Eq. (4d), the computation procedure for $\mathrm{SSM}_{A,B,C}$ is

Reviewer 02 · Rating 3 · Confidence 5

Strengths

1. The experiments are extensive.

Weaknesses

1. The novelty of this work is low. Some important modules are highly similar to existing work. Specifically:
   - In Section 3.1, the "Encode Neighbor Features" and "Patching Neighbors" sections are highly similar to DyGFormer [1].
   - The main network architecture is taken from the original Mamba [2], without incorporating the structural information of the dynamic graph (see Eqs. 5, 6, and 7).
2. The authors claim that applying Mamba for dynamic graph learning is to address the challenge of computati

Reviewer 03 · Rating 8 · Confidence 5

Strengths

This is arguably the first Mamba model for dynamic graph representation learning, and I think the long context modeling ability of Mamba is suitable for temporal graph learning. I appreciate the designs of two types of SSM blocks that consider both node-level and edge-level information, both of which encode critical information about temporal patterns. The proposed DyGMamba has a satisfactory performance by outperforming baseline methods with higher accuracy in link prediction, shorter training

Weaknesses

The overall design of DyGMamba makes sense to me, as it basically employs SSM blocks for node features and edge features. I have a few questions regarding the experiments in this paper.
* None of the datasets used in this paper have node features. Based on TGAT, I assume the authors are using all-zero vectors as node features. I wonder how DyGMamba would perform when there are node features, e.g., on the GDELT dataset.
* Meanwhile, the ablation studies in Tables 3 and 4 are interesting. As you shift from

Taxonomy

Methods · Mamba: Linear-Time Sequence Modeling with Selective State Spaces