GDGB: A Benchmark for Generative Dynamic Text-Attributed Graph Learning

Jie Peng; Jiarui Ji; Runlin Lei; Zhewei Wei; Yongchao Liu; Chuntao Hong

arXiv:2507.03267 · cs.AI · February 19, 2026

TL;DR

This paper introduces GDGB, a comprehensive benchmark with high-quality datasets and tasks for generative dynamic text-attributed graphs, enabling rigorous evaluation of models that generate complex, evolving graph data with rich textual information.

Contribution

It provides the first standardized benchmark with curated datasets, novel generation tasks, and evaluation metrics for generative modeling of dynamic text-attributed graphs.

Findings

1. GDGB datasets improve textual quality over prior datasets.
2. The proposed tasks, TDGG and IDGG, facilitate structured evaluation of generative models.
3. Experimental results highlight the importance of structural and textual features in generation quality.

Abstract

Dynamic Text-Attributed Graphs (DyTAGs), which intricately integrate structural, temporal, and textual attributes, are crucial for modeling complex real-world systems. However, most existing DyTAG datasets exhibit poor textual quality, which severely limits their utility for generative DyTAG tasks requiring semantically rich inputs. Additionally, prior work mainly focuses on discriminative tasks on DyTAGs, resulting in a lack of standardized task formulations and evaluation protocols tailored for DyTAG generation. To address these critical issues, we propose Generative DyTAG Benchmark (GDGB), which comprises eight meticulously curated DyTAG datasets with high-quality textual features for both nodes and edges, overcoming limitations of prior datasets. Building on GDGB, we define two novel DyTAG generation tasks: Transductive Dynamic Graph Generation (TDGG) and Inductive Dynamic Graph…

Peer Reviews

Decision: ICLR 2026 Poster

Reviewer 01 · Rating 4 · Confidence 5

Strengths

1. The paper is well written and easy to follow.
2. The authors introduce several dynamic text-attributed graph datasets with higher text quality than existing benchmarks.
3. The authors design an LLM-based multi-agent generative framework; an accompanying illustration would improve comprehensibility.

Weaknesses

Major Concern

1. Claim of the “first” generative DyTAG benchmark. The paper claims to be the first generative DyTAG benchmark, yet it does not evaluate existing TAG-generation or DyG-generation methods. Instead, it varies only the LLM backbones for DyTAG generation. Without comparisons to prior generative baselines, the “first” or “state-of-the-art” claim is overstated. Please include representative TAG/DyG generative methods (or strong non-LLM baselines) and report side-by-s…

Reviewer 02 · Rating 8 · Confidence 3

Strengths

1. This paper is well written and easy to follow.
2. The paper proposes the first dedicated generative DyTAG benchmark, with rich, realistic node and edge texts across eight diverse domains.
3. The TDGG and IDGG tasks are well defined, and the metrics (structure, text, embedding) give a comprehensive picture of quality.

Weaknesses

1. IDGG new-node evaluation may lack direct semantic-drift or human-amenity checks.

Reviewer 03 · Rating 4 · Confidence 3

Strengths

a) There is a clear gap in standardized datasets for text-attributed graphs, which this paper attempts to fill.
b) Eight novel text-attributed dynamic graph datasets are proposed.
c) A comprehensive comparison against existing datasets, in terms of the richness and utility of the textual attributes, highlights their importance.
d) The paper clearly motivates the problem and reports a detailed analysis of these datasets.

Weaknesses

A) The paper somewhat dilutes its core contribution by introducing a generative framework, GAG-General, which is adopted from existing work. The core contribution could simply be a text-attributed benchmark with evaluation metrics. The generative framework could be proposed as a baseline method; presenting it as a central contribution overemphasizes an incremental component, as the novelty here is very limited.
B) Text quality and graph embedding metrics: There seems to be a dependency on the u…

Code & Models

Repositories

ji-cather/graphagent

Taxonomy

Topics: Topic Modeling · Advanced Graph Neural Networks · Natural Language Processing Techniques