Are Large Language Models Good Temporal Graph Learners?
Shenyang Huang, Ali Parviz, Emma Kondrup, Zachary Yang, Zifeng Ding, Michael Bronstein, Reihaneh Rabbany, Guillaume Rabusseau

TL;DR
This paper introduces TGTalker, a novel framework that applies Large Language Models to real-world temporal graphs, achieving competitive link prediction results and providing textual explanations, thus advancing temporal graph learning and interpretability.
Contribution
The paper presents TGTalker, the first framework leveraging LLMs for real-world temporal graph learning with explainability, outperforming popular models on multiple datasets.
Findings
TGTalker achieves competitive link prediction accuracy.
TGTalker outperforms models like TGN and HTGN.
It provides textual explanations for predictions.
Abstract
Large Language Models (LLMs) have recently driven significant advancements in Natural Language Processing and various other applications. While a broad range of literature has explored the graph-reasoning capabilities of LLMs, including their use as predictors on graphs, the application of LLMs to dynamic graphs -- real-world evolving networks -- remains relatively unexplored. Recent work studies synthetic temporal graphs generated by random graph models, but applying LLMs to real-world temporal graphs remains an open question. To address this gap, we introduce Temporal Graph Talker (TGTalker), a novel temporal graph learning framework designed for LLMs. TGTalker utilizes the recency bias in temporal graphs to extract relevant structural information, which is converted to natural language for LLMs, while leveraging temporal neighbors as additional information for prediction. TGTalker…
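The pipeline described in the abstract (select recent edges, verbalize them, add temporal neighbors) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Edge` type, the function names, the prompt wording, and the window sizes `k_edges` and `k_nbrs` are all hypothetical choices made for illustration only.

```python
from typing import List, NamedTuple


class Edge(NamedTuple):
    src: int
    dst: int
    t: int  # interaction timestamp


def recent_edges(edges: List[Edge], k: int) -> List[Edge]:
    """Keep the k most recent edges (recency bias), ordered oldest to newest."""
    return sorted(edges, key=lambda e: e.t)[-k:]


def temporal_neighbors(edges: List[Edge], node: int, before: int, k: int) -> List[int]:
    """Return the k nodes that `node` interacted with most recently, strictly before `before`."""
    hits = [
        (e.t, e.dst if e.src == node else e.src)
        for e in edges
        if node in (e.src, e.dst) and e.t < before
    ]
    hits.sort()
    return [nbr for _, nbr in hits[-k:]]


def build_prompt(edges: List[Edge], src: int, t: int, k_edges: int = 3, k_nbrs: int = 2) -> str:
    """Verbalize recent structure plus temporal neighbors (hypothetical prompt wording)."""
    past = [e for e in edges if e.t < t]  # never leak future edges into the prompt
    lines = ["Recent interactions (source, destination, time):"]
    lines += [f"({e.src}, {e.dst}, {e.t})" for e in recent_edges(past, k_edges)]
    lines.append(f"Most recent neighbors of node {src}: {temporal_neighbors(edges, src, t, k_nbrs)}")
    lines.append(f"Which node will node {src} interact with at time {t}?")
    return "\n".join(lines)
```

The resulting string would then be sent to an LLM of choice; how the model's answer is parsed and scored is outside the scope of this sketch.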
Peer Reviews
Decision: ICLR 2026 Conference Withdrawn Submission
Although it does not involve any task-specific training, TGTalker achieves performance comparable to state-of-the-art temporal graph neural networks. The paper is clearly written and well organized, making it easy to read and understand.
The paper mainly demonstrates that LLMs can be applied to temporal graphs but offers little theoretical understanding of why this works. It lacks a principled analysis of temporal reasoning, structure encoding, or generalization, so the contribution feels more empirical than conceptual. The approach treats LLM prompting as a black box, missing opportunities to relate findings to established principles like message passing, memory models, or temporal inductive bias. The evaluation lacks rigorou
S1. Novel perspective on applying pre-trained LLMs to the temporal graph domain. This paper explores their capabilities in this domain and answers fundamental questions. S2. Extensive evaluation on multiple LLMs. S3. Clear paper writing.
**W1.** A key weakness in the paper is the unclear positioning of its main contribution. There are 2 parallel goals: to show that LLMs can perform temporal link prediction, and that LLMs can also generate textual explanations of those predictions. The explanations component is largely a by-product of prompting rather than a distinct methodological contribution. The link-prediction contribution itself requires further clarification. A more focused paper at this early stage of exploring LLMs for
1. The authors introduce prompt engineering of the LLMs for the temporal link prediction tasks in a temporal graph, which is easy to follow. 2. The paper is well-written.
1. This paper seems to lack novelty. TGTalker employs a similar prompting paradigm to LLM4DyG, and no learning process has been introduced during the whole pipeline. Moreover, utilizing LLMs cannot be considered a novelty within the temporal graph learning community, as there already exist related works that integrate LLMs into temporal graphs [1][2]. 2. The baselines used in this paper seem to be insufficient. The authors only compare against the recent model TNCN, while the other baselines are
The main strengths of the work are the following: - Strength 1: the paper is well-written, easy to follow and logically structured. Figures effectively illustrate the framework. As a very side note, numbers on Figure 3 may be a bit too small; - Strength 2: the experiments span multiple LLMs and multiple datasets providing a solid empirical grounding; - Strength 3: the work applies LLMs to TG and this is relevant to current trends in combining structured reasoning with language models.
The main weaknesses of this paper are the following: - Weakness 1: while the paper describes the context for TGTalker, it does not properly discuss why this specific design (e.g., background set, example set, …) works or how sensitive the results are to prompt formulation. There is also limited discussion of computational constraints (e.g., token limits, context truncation strategies, ...) and their impact; - Weakness 2: concerning the selected baselines, it is not clear whether
Taxonomy
Topics: Advanced Graph Neural Networks · Topic Modeling · Explainable Artificial Intelligence (XAI)
