Pattern-Based Graph Classification: Comparison of Quality Measures and Importance of Preprocessing
Lucas Potin, Rosa Figueiredo, Vincent Labatut, Christine Largeron

TL;DR
This paper compares 38 quality measures for pattern-based graph classification, highlighting the importance of preprocessing and revealing that some popular measures are suboptimal for classification tasks.
Contribution
It provides a comprehensive theoretical and empirical comparison of quality measures and introduces a clustering-based preprocessing step to improve classification performance.
Findings
Preprocessing reduces pattern set size without sacrificing accuracy.
Some widely used quality measures are not optimal for classification.
Clustering patterns enhances interpretability and efficiency.
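As a rough illustration of how a clustering step can shrink a pattern set without losing information, the sketch below groups patterns that occur in exactly the same graphs and keeps one representative per group. This grouping criterion is an assumption made for the example; the summary does not state which criterion the paper uses. The names `group_by_footprint` and `pick_representatives` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set


def group_by_footprint(patterns: Dict[str, Set[int]]) -> List[List[str]]:
    """Group pattern names whose occurrence sets (graph indices) are identical.

    Assumption: two patterns with the same occurrence set are interchangeable
    as classification features, so they can share one cluster.
    """
    groups = defaultdict(list)
    for name, occurrences in patterns.items():
        # frozenset makes the occurrence set hashable so it can be a dict key
        groups[frozenset(occurrences)].append(name)
    return list(groups.values())


def pick_representatives(groups: List[List[str]]) -> List[str]:
    """Keep the first pattern of each group as its representative."""
    return [group[0] for group in groups]
```

With `{"p1": {0, 2}, "p2": {0, 2}, "p3": {1}}`, patterns `p1` and `p2` fall into one group, so only two features remain instead of three.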
Abstract
Graph classification aims to categorize graphs based on their structural and attribute features, with applications in diverse fields such as social network analysis and bioinformatics. Among the methods proposed to solve this task, those relying on patterns (i.e. subgraphs) provide good explainability, as the patterns used for classification can be directly interpreted. To identify meaningful patterns, a standard approach is to use a quality measure, i.e. a function that evaluates the discriminative power of each pattern. However, the literature provides tens of such measures, making it difficult to select the most appropriate for a given application. Only a handful of surveys try to provide some insight by comparing these measures, and none of them specifically focuses on graphs. This typically results in the systematic use of the most widespread measures, without thorough evaluation.…
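To make the notion of a quality measure concrete, here is a minimal sketch that scores each pattern by the absolute difference between its support in the two classes, a measure commonly used in discriminative pattern mining. Whether and how this particular measure figures among those compared in the paper is not stated in this summary; the data layout (each pattern mapped to the set of graph indices where it occurs, with binary labels) and the function names are assumptions for illustration.

```python
from typing import Dict, List, Set, Tuple


def class_supports(occurrences: Set[int], labels: List[int]) -> Tuple[float, float]:
    """Return the fraction of positive and of negative graphs containing the pattern.

    Assumes labels are 0/1 and both classes are non-empty.
    """
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    supp_pos = sum(1 for i in pos if i in occurrences) / len(pos)
    supp_neg = sum(1 for i in neg if i in occurrences) / len(neg)
    return supp_pos, supp_neg


def support_difference(occurrences: Set[int], labels: List[int]) -> float:
    """Quality measure: absolute gap between the two class supports."""
    supp_pos, supp_neg = class_supports(occurrences, labels)
    return abs(supp_pos - supp_neg)


def rank_patterns(patterns: Dict[str, Set[int]], labels: List[int]) -> List[str]:
    """Order pattern names from most to least discriminative."""
    scores = {name: support_difference(occ, labels) for name, occ in patterns.items()}
    return sorted(scores, key=scores.get, reverse=True)
```

A pattern present in every positive graph and no negative graph scores 1.0, while a pattern present in every graph scores 0.0, which matches the intuition that ubiquitous patterns carry no discriminative power. Swapping `support_difference` for another function with the same signature is enough to compare measures on the same pattern set.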
