Open Graph Benchmark: Datasets for Machine Learning on Graphs

Weihua Hu; Matthias Fey; Marinka Zitnik; Yuxiao Dong; Hongyu Ren; Bowen Liu; Michele Catasta; Jure Leskovec

arXiv:2005.00687 · cs.LG · February 26, 2021 · 491 citations

TL;DR

The Open Graph Benchmark (OGB) offers a comprehensive set of large-scale, diverse, and realistic graph datasets with standardized evaluation protocols to advance scalable and robust graph machine learning research.

Contribution

OGB introduces a unified benchmark suite with diverse datasets, evaluation protocols, and an automated pipeline, facilitating reproducible and scalable graph ML research.

Findings

01  OGB datasets are challenging for current models.

02  Experiments reveal significant challenges in scaling to large graphs and in out-of-distribution generalization under realistic data splits.

03  Benchmark results highlight areas for future improvement.

Abstract

We present the Open Graph Benchmark (OGB), a diverse set of challenging and realistic benchmark datasets to facilitate scalable, robust, and reproducible graph machine learning (ML) research. OGB datasets are large-scale, encompass multiple important graph ML tasks, and cover a diverse range of domains, ranging from social and information networks to biological networks, molecular graphs, source code ASTs, and knowledge graphs. For each dataset, we provide a unified evaluation protocol using meaningful application-specific data splits and evaluation metrics. In addition to building the datasets, we also perform extensive benchmark experiments for each dataset. Our experiments suggest that OGB datasets present significant challenges of scalability to large-scale graphs and out-of-distribution generalization under realistic data splits, indicating fruitful opportunities for future…
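To make the idea of an application-specific data split paired with a fixed evaluation metric concrete, here is a minimal sketch in plain Python. It is not the OGB package API: the function names `time_based_split` and `accuracy`, the toy years, and the toy labels are all hypothetical, and a time-based split is only one plausible kind of realistic split, chosen for illustration.

```python
from typing import Dict, List, Sequence


def time_based_split(
    years: Sequence[int], train_until: int, valid_until: int
) -> Dict[str, List[int]]:
    """Assign item indices to train/valid/test by a timestamp attribute.

    Hypothetical helper: items dated up to `train_until` go to train,
    items dated up to `valid_until` go to valid, and later items go to test.
    """
    split: Dict[str, List[int]] = {"train": [], "valid": [], "test": []}
    for idx, year in enumerate(years):
        if year <= train_until:
            split["train"].append(idx)
        elif year <= valid_until:
            split["valid"].append(idx)
        else:
            split["test"].append(idx)
    return split


def accuracy(
    y_true: Sequence[int], y_pred: Sequence[int], indices: Sequence[int]
) -> float:
    """Fraction of correct predictions, restricted to the given indices."""
    if not indices:
        raise ValueError("cannot evaluate on an empty split")
    correct = sum(1 for i in indices if y_true[i] == y_pred[i])
    return correct / len(indices)


# Toy data (assumed for illustration only): one year and one label per node.
years = [2015, 2016, 2017, 2018, 2019, 2020]
y_true = [0, 1, 0, 1, 1, 0]
y_pred = [0, 1, 0, 1, 0, 0]  # stand-in for some model's predictions

split_idx = time_based_split(years, train_until=2017, valid_until=2018)
valid_acc = accuracy(y_true, y_pred, split_idx["valid"])
test_acc = accuracy(y_true, y_pred, split_idx["test"])
print(split_idx)
print(f"valid accuracy: {valid_acc:.2f}, test accuracy: {test_acc:.2f}")
```

Because the test items are all newer than anything seen in training, a split like this probes generalization to shifted data rather than to a random held-out sample, which is the kind of out-of-distribution setting the abstract refers to.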

Code & Models

Videos

Open Graph Benchmark: Datasets for Machine Learning on Graphs · SlidesLive

Taxonomy

Topics: Advanced Graph Neural Networks · Graph Theory and Algorithms · Machine Learning in Healthcare