Cross-Domain Graph Data Scaling: A Showcase with Diffusion Models
Wenzhuo Tang, Haitao Mao, Danial Dervovic, Ivan Brugere, Saumitra Mishra, Yuying Xie, Jiliang Tang

TL;DR
This paper introduces UniAug, a diffusion model-based universal graph structure augmentor that enhances data scaling and improves performance across diverse graph-based tasks by adaptively augmenting graph data.
Contribution
We develop UniAug, a novel diffusion model-based graph augmentor that enables effective data scaling across heterogeneous graphs for improved downstream task performance.
Findings
Consistent performance improvements across multiple graph tasks.
First demonstration of a cross-domain graph structure augmentor.
Effective adaptive augmentation via pre-trained diffusion models.
Abstract
Models for natural language and images benefit from data scaling behavior: the more data they are fed, the better they perform. This 'better with more' phenomenon is what makes large-scale pre-training on vast amounts of data effective. However, current graph pre-training methods struggle to scale up data due to heterogeneity across graphs. To achieve effective data scaling, we aim to develop a general model that can capture the diverse data patterns of graphs and be used adaptively to help downstream tasks. To this end, we propose UniAug, a universal graph structure augmentor built on a diffusion model. We first pre-train a discrete diffusion model on thousands of graphs across domains to learn the graph structural patterns. In the downstream phase, we provide adaptive enhancement by conducting graph structure augmentation with the help of the pre-trained…
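To make the phrase "discrete diffusion model on graphs" concrete, here is a minimal sketch of a single forward (noising) step on a binary adjacency matrix. It is an illustration only, not UniAug's implementation: the function name `noise_adjacency`, the parameter `beta`, and the uniform resampling transition are all assumptions, since the abstract does not specify the noise schedule or transition used.

```python
import numpy as np

def noise_adjacency(adj, beta, rng):
    """One forward step of a binary discrete diffusion on an undirected graph.

    Each node pair keeps its edge state with probability 1 - beta and is
    otherwise resampled uniformly from {0, 1}. The uniform transition and
    the name `beta` are illustrative assumptions, not the paper's schedule.
    """
    n = adj.shape[0]
    iu = np.triu_indices(n, k=1)              # one entry per unordered node pair
    edges = adj[iu]
    resample = rng.random(edges.shape[0]) < beta
    random_states = rng.integers(0, 2, size=edges.shape[0])
    noisy_edges = np.where(resample, random_states, edges)
    noisy = np.zeros_like(adj)
    noisy[iu] = noisy_edges
    return noisy + noisy.T                    # symmetric, no self-loops

rng = np.random.default_rng(0)
adj = np.array([[0, 1, 1, 0],
                [1, 0, 1, 0],
                [1, 1, 0, 1],
                [0, 0, 1, 0]])
noisy = noise_adjacency(adj, beta=0.3, rng=rng)
```

A denoising network would be trained to reverse steps like this one. In an augmentation setting, sampling from the learned reverse process could then propose modified structures for a downstream graph, though how UniAug guides that sampling is not covered by the truncated abstract.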
Taxonomy
Topics: Graph Theory and Algorithms · Advanced Graph Neural Networks · Data Mining Algorithms and Applications
Methods: Diffusion
