Towards Automated Self-Supervised Learning for Truly Unsupervised Graph Anomaly Detection
Zhong Li, Yuhang Wang, Matthijs van Leeuwen

TL;DR
This paper investigates the impact of SSL strategy choices and hyperparameter tuning on graph anomaly detection performance, highlighting issues with label leakage and proposing an internal evaluation method for hyperparameter selection.
Contribution
It introduces an internal evaluation strategy with theoretical backing to select hyperparameters in SSL-based unsupervised graph anomaly detection, addressing label leakage problems.
Findings
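The choice of SSL strategy, its hyperparameter settings, and the combination weights across multiple strategies substantially affect detection performance across datasets.
Selecting hyperparameters with label information in an unsupervised setting leaks labels and severely overestimates a method's performance.
The proposed internal evaluation method selects hyperparameters effectively without relying on labels.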
Abstract
Self-supervised learning (SSL) is an emerging paradigm that exploits supervisory signals generated from the data itself, and many recent studies have leveraged SSL to conduct graph anomaly detection. However, we empirically found that three important factors can substantially impact detection performance across datasets: 1) the specific SSL strategy employed; 2) the tuning of the strategy's hyperparameters; and 3) the allocation of combination weights when using multiple strategies. Most SSL-based graph anomaly detection methods circumvent these issues by arbitrarily or selectively (i.e., guided by label information) choosing SSL strategies, hyperparameter settings, and combination weights. While an arbitrary choice may lead to subpar performance, using label information in an unsupervised setting is label information leakage and leads to severe overestimation of a method's performance.…
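The overestimation caused by label leakage can be illustrated with a toy simulation. This is a minimal sketch and not the paper's method or experimental setup: every "configuration" below produces pure-noise anomaly scores, so no configuration is genuinely better than another. The function names (`roc_auc`, `simulate`) and all parameter values are hypothetical, chosen only for illustration.

```python
import random


def roc_auc(labels, scores):
    """AUC as the probability that a random anomaly outscores a random
    normal node, with ties counted as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def simulate(n_nodes=200, n_anomalies=20, n_configs=50, seed=0):
    """Compare label-guided selection against a fixed, label-free choice.

    Assumption: each hyperparameter configuration yields random scores,
    so the true expected AUC of every configuration is 0.5.
    """
    rng = random.Random(seed)
    labels = [1] * n_anomalies + [0] * (n_nodes - n_anomalies)

    aucs = []
    for _ in range(n_configs):
        scores = [rng.random() for _ in range(n_nodes)]
        aucs.append(roc_auc(labels, scores))

    leaked = max(aucs)                 # configuration picked by peeking at labels
    arbitrary = aucs[0]                # configuration fixed before seeing labels
    mean_auc = sum(aucs) / len(aucs)   # average over all configurations
    return leaked, arbitrary, mean_auc


if __name__ == "__main__":
    leaked, arbitrary, mean_auc = simulate()
    print(f"label-guided selection: {leaked:.3f}")
    print(f"arbitrary selection:    {arbitrary:.3f}")
    print(f"mean over configs:      {mean_auc:.3f}")
```

Because the scores carry no signal, the average AUC stays near 0.5, while the label-guided pick reports the maximum over all configurations and therefore looks better than chance. A label-free internal criterion avoids this selection bias because the evaluation labels never enter the selection step.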
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Software System Performance and Reliability · Network Security and Intrusion Detection
