Data Augmentation for Graph Classification

Jiajun Zhou; Jie Shen; Qi Xuan

arXiv:2009.09863 · cs.SI · September 22, 2020

TL;DR

This paper introduces graph data augmentation techniques and a model evolution framework to improve graph classification accuracy on small datasets, reducing overfitting and enhancing model performance.

Contribution

It presents two heuristic algorithms for graph augmentation and a generic framework, M-Evolve, for iterative model improvement on limited data.
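To make the idea of heuristic structural augmentation concrete, here is a minimal Python sketch of random edge rewiring on an undirected graph. It is an illustration only: the function name `random_mapping_sketch`, the `budget` parameter, and the rule of removing and adding the same number of edges are assumptions made for this example, not details confirmed by this page.

```python
import itertools
import math
import random


def random_mapping_sketch(edges, num_nodes, budget=0.2, seed=None):
    """Return a perturbed copy of an undirected edge set.

    Assumed behaviour (hypothetical, for illustration): remove
    ceil(budget * m) existing edges and add the same number of
    previously absent node pairs, both chosen uniformly at random.
    """
    if not 0.0 <= budget <= 1.0:
        raise ValueError("budget must lie in [0, 1]")
    rng = random.Random(seed)

    # Normalise each undirected edge to a sorted tuple so (1, 0) == (0, 1).
    edge_set = {tuple(sorted(e)) for e in edges}

    # Candidate additions: every node pair that is not currently connected.
    non_edges = [
        pair
        for pair in itertools.combinations(range(num_nodes), 2)
        if pair not in edge_set
    ]

    # Number of edges to swap, capped by how many non-edges exist.
    k = min(math.ceil(budget * len(edge_set)), len(non_edges))

    removed = set(rng.sample(sorted(edge_set), k))
    added = set(rng.sample(non_edges, k))

    # The augmented graph keeps the original edge count.
    return (edge_set - removed) | added
```

The augmented graph would inherit the label of the graph it was derived from, which is why such samples are only weakly labeled: the structural change may or may not preserve the property that determines the class.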

Findings

01 M-Evolve improves accuracy by 3-12% on benchmark datasets.

02 Graph augmentation reduces overfitting in small-scale datasets.

03 The approach is effective across multiple graph classification tasks.

Abstract

Graph classification, which aims to identify the category labels of graphs, plays a significant role in drug classification, toxicity detection, protein analysis, etc. However, the limited scale of benchmark datasets makes graph classification models prone to over-fitting and under-generalization. To address this, we introduce data augmentation on graphs and present two heuristic algorithms, random mapping and motif-similarity mapping, which generate more weakly labeled data for small-scale benchmark datasets via heuristic modification of graph structures. Furthermore, we propose a generic model evolution framework, M-Evolve, which combines graph augmentation, data filtration and model retraining to optimize pre-trained graph classifiers. Experiments conducted on six benchmark datasets demonstrate that M-Evolve helps existing graph classification models alleviate…
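The augmentation, filtration and retraining loop described in the abstract can be sketched roughly as follows. This is a simplified outline under stated assumptions, not the authors' implementation: the callables `train_fn`, `predict_proba_fn` and `augment_fn`, the fixed `threshold`, and the choice to accumulate accepted samples across rounds are all hypothetical and introduced only for illustration.

```python
def m_evolve_sketch(train_fn, predict_proba_fn, augment_fn, data,
                    rounds=3, threshold=0.5):
    """Iteratively augment, filter, and retrain a graph classifier.

    data             : list of (graph, label) pairs
    train_fn         : hypothetical callable, list of pairs -> model
    predict_proba_fn : hypothetical callable, (model, graph) -> mapping
                       from label to predicted probability
    augment_fn       : hypothetical callable, graph -> augmented graph
    threshold        : assumed fixed acceptance threshold
    """
    model = train_fn(data)  # stands in for the pre-trained classifier
    pool = list(data)

    for _ in range(rounds):
        # Augmentation: each new graph inherits its parent's label,
        # so the label is only weakly reliable.
        candidates = [(augment_fn(graph), label) for graph, label in data]

        # Filtration: keep a candidate only if the current model assigns
        # enough probability to the inherited label.
        kept = [
            (graph, label)
            for graph, label in candidates
            if predict_proba_fn(model, graph)[label] >= threshold
        ]

        # Retraining: fit again on the original data plus accepted samples.
        pool = pool + kept
        model = train_fn(pool)

    return model, pool
```

Passing the classifier in as plain callables keeps the sketch independent of any particular graph learning library; a fixed threshold is the simplest possible filter and stands in for whatever reliability criterion the framework actually uses.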
