Classification on Large Networks: A Quantitative Bound via Motifs and Graphons

Andreas Haupt; Mohammad Khatami; Thomas Schultz; Ngoc Mai Tran

arXiv:1710.08878·cs.LG·October 25, 2017·2 cites

Open Access

TL;DR

This paper introduces a theoretically grounded, easily computable method for classifying large networks using motif homomorphism densities and graph spectra, with explicit quantitative bounds and competitive results on a Lupus Erythematosus classification task.

Contribution

It provides explicit quantitative bounds for motif-based classification of large graphs using graphon theory, connecting motifs, spectra, and statistical guarantees.

Findings

01  Achieves competitive classification results on Lupus Erythematosus data.

02  Provides explicit bounds for motif homomorphism distinguishability.

03  Connects graph spectra to homomorphism densities with theoretical guarantees.

Abstract

When each data point is a large graph, graph statistics such as densities of certain subgraphs (motifs) can be used as feature vectors for machine learning. While intuitive, motif counts are expensive to compute and difficult to work with theoretically. Via graphon theory, we give an explicit quantitative bound for the ability of motif homomorphisms to distinguish large networks under both generative and sampling noise. Furthermore, we give similar bounds for the graph spectrum and connect it to homomorphism densities of cycles. This results in an easily computable classifier on graph data with a theoretical performance guarantee. Our method yields competitive results on classification tasks for the autoimmune disease Lupus Erythematosus.
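The link between the spectrum and cycle homomorphism densities mentioned in the abstract can be illustrated with a standard identity: for a simple undirected graph G on n vertices with adjacency eigenvalues λ_1, ..., λ_n, the number of homomorphisms from the k-cycle C_k into G equals the number of closed walks of length k, so t(C_k, G) = (λ_1^k + ... + λ_n^k) / n^k. The sketch below is not the authors' implementation; the function names and the choice of cycle lengths are assumptions made for illustration. It computes cycle densities from the spectrum and checks them against a brute-force count on small graphs.

```python
import itertools
import numpy as np


def cycle_density_spectral(adj, k):
    """Homomorphism density t(C_k, G) from the adjacency spectrum.

    Uses hom(C_k, G) = trace(A^k) = sum_i lambda_i^k, divided by n^k.
    Assumes `adj` is a symmetric 0/1 adjacency matrix and k >= 3.
    """
    a = np.asarray(adj, dtype=float)
    n = a.shape[0]
    eigvals = np.linalg.eigvalsh(a)  # eigenvalues of a symmetric matrix
    return float(np.sum(eigvals ** k) / n ** k)


def cycle_density_bruteforce(adj, k):
    """Same density by enumerating all n^k vertex maps (small graphs only)."""
    a = np.asarray(adj)
    n = a.shape[0]
    count = 0
    for verts in itertools.product(range(n), repeat=k):
        # A map is a homomorphism if consecutive cycle vertices land on edges.
        if all(a[verts[i], verts[(i + 1) % k]] for i in range(k)):
            count += 1
    return count / n ** k


def spectral_features(adj, ks=(3, 4, 5, 6)):
    """Hypothetical feature vector: cycle densities for several lengths."""
    return np.array([cycle_density_spectral(adj, k) for k in ks])


# Two small example graphs: a triangle and a 4-cycle.
triangle = np.array([[0, 1, 1],
                     [1, 0, 1],
                     [1, 1, 0]])
square = np.array([[0, 1, 0, 1],
                   [1, 0, 1, 0],
                   [0, 1, 0, 1],
                   [1, 0, 1, 0]])

for name, g in [("triangle", triangle), ("square", square)]:
    for k in (3, 4):
        spec = cycle_density_spectral(g, k)
        brute = cycle_density_bruteforce(g, k)
        print(name, k, round(spec, 6), round(brute, 6))
```

The spectral route needs a single eigendecomposition per graph, whereas the brute-force count grows as n^k, which is why spectrum-based features are attractive for large networks. A vector such as `spectral_features(adj)` could then be passed to any standard classifier; which classifier the paper uses is not stated on this page.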

Taxonomy

Topics: Bioinformatics and Genomic Networks · Topological and Geometric Data Analysis · Genomics and Chromatin Dynamics