Data-Centric Learning from Unlabeled Graphs with Diffusion Model

Gang Liu; Eric Inae; Tong Zhao; Jiaxin Xu; Tengfei Luo; Meng Jiang

arXiv:2303.10108 · cs.LG · October 13, 2023 · 6 cites

TL;DR

This paper introduces a data-centric approach using diffusion models to leverage unlabeled graphs for property prediction, generating task-specific labeled examples that improve performance over traditional self-supervised methods.

Contribution

The paper proposes a novel diffusion-based method to extract and utilize knowledge from unlabeled graphs by generating labeled graph examples tailored to each prediction task.

Findings

01 Outperforms 15 existing methods on 15 tasks

02 Generated labeled examples improve prediction accuracy

03 Unlabeled data enhances performance beyond self-supervised learning

Abstract

Graph property prediction tasks are important and numerous. While each task offers only a small number of labeled examples, unlabeled graphs have been collected at large scale from various sources. A conventional approach is to train a model on the unlabeled graphs with self-supervised tasks and then fine-tune it on the prediction tasks. However, the knowledge learned from the self-supervised tasks may not align with, and sometimes conflicts with, what the predictions need. In this paper, we propose to extract the knowledge underlying the large set of unlabeled graphs as a specific set of useful data points that augment each property prediction model. We use a diffusion model to fully utilize the unlabeled graphs, and we design two new objectives that use each task's labeled data to guide the model's denoising process, so that it generates task-specific graph examples and their labels. Experiments demonstrate that…
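
The guided-denoising idea in the abstract can be illustrated with a toy sketch. It is not the paper's graph diffusion model and does not implement its two objectives. It works on plain vectors, assumes the "unlabeled data" follow a Gaussian N(mu, I) whose score is known exactly, and uses a hypothetical guidance term that pulls samples toward one labeled example. All names (`guided_denoise`, `score_fn`, `guidance_fn`, `mu`, `x_labeled`) are invented for illustration, and copying the seed's label to the generated example is also an assumption.

```python
import numpy as np

def guided_denoise(x_start, score_fn, guidance_fn, steps=200, step_size=0.05, weight=1.0):
    """Deterministic toy reverse process: repeatedly step along the prior
    score plus a weighted task-guidance gradient."""
    x = np.array(x_start, dtype=float)
    for _ in range(steps):
        x = x + step_size * (score_fn(x) + weight * guidance_fn(x))
    return x

# Hypothetical stand-ins (assumptions, not taken from the paper):
mu = np.array([0.0, 0.0])          # mean of the "unlabeled data" distribution N(mu, I)
x_labeled = np.array([2.0, -1.0])  # one labeled example from the downstream task
y_labeled = 1.0                    # its label

def score_fn(x):
    # Exact score (gradient of the log-density) of N(mu, I).
    return -(x - mu)

def guidance_fn(x):
    # Gradient of -0.5 * ||x - x_labeled||^2: keeps samples task-relevant.
    return -(x - x_labeled)

rng = np.random.default_rng(0)
x_noisy = x_labeled + rng.normal(size=2)   # perturb the labeled example with noise
x_generated = guided_denoise(x_noisy, score_fn, guidance_fn)

# Augment the task's training set; the copied label is an assumption of this sketch.
augmented = [(x_labeled, y_labeled), (x_generated, y_labeled)]
print(x_generated)
```

With these choices the update is a linear contraction, so the generated point settles at the weighted average (mu + weight * x_labeled) / (1 + weight): a compromise between what the unlabeled data suggest and what the task's labeled example suggests. A real system would replace the closed-form score with a learned denoiser over graphs and the quadratic pull with task-derived objectives.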

Code & Models

Repositories

liugangcode/data_centric_transfer
PyTorch · Official

Videos

Data-Centric Learning from Unlabeled Graphs with Diffusion Model · SlidesLive

Taxonomy

Topics: Advanced Graph Neural Networks · Topic Modeling · Machine Learning and Data Classification

Methods: Diffusion