EagleMine: Vision-Guided Mining in Large Graphs
Wenjie Feng, Shenghua Liu, Christos Faloutsos, Bryan Hooi, Huawei Shen, Xueqi Cheng

TL;DR
EagleMine is a vision-inspired algorithm that identifies micro-cluster patterns and anomalies in large graphs by analyzing node feature distributions through a multi-resolution water-level tree approach.
Contribution
The paper introduces EagleMine, a novel vision-guided method for detecting micro-clusters and anomalies in large graphs using a water-level tree and statistical hypothesis testing.
Findings
EagleMine finds truncated and overlapping elliptical clusters in two-dimensional histograms built from node features of large graphs.
The method accurately detects bots and anomalous users in real Microblog data.
EagleMine outperforms baseline methods in cluster detection and anomaly identification.
Abstract
Given a graph with millions of nodes, what patterns exist in the distributions of node characteristics, and how can we detect them and separate anomalous nodes in a way similar to human vision? In this paper, we propose a vision-guided algorithm, EagleMine, to summarize micro-cluster patterns in two-dimensional histogram plots constructed from node features in a large graph. EagleMine utilizes a water-level tree to capture cluster structures according to vision-based intuition at multi-resolutions. EagleMine traverses the water-level tree from the root and adopts statistical hypothesis tests to determine the optimal clusters that should be fitted along the path, and summarizes each cluster with a truncated Gaussian distribution. Experiments on real data show that our method can find truncated and overlapped elliptical clusters, even when some baseline methods split one visual cluster…
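The water-level idea in the abstract can be illustrated with a minimal sketch. It assumes the two-dimensional histogram is already available as a grid of node counts, that the "water level" rises in powers of ten, and that an island is a 4-connected group of cells whose count is at or above the current level. These choices, and the names `islands` and `water_level_tree`, are hypothetical and are not taken from the paper; the hypothesis tests and truncated Gaussian fitting are not shown.

```python
from collections import deque


def islands(hist, level):
    """Return the 4-connected groups of cells with count >= 10**level."""
    rows, cols = len(hist), len(hist[0])
    threshold = 10 ** level  # assumed log-scale water level
    seen = set()
    found = []
    for r in range(rows):
        for c in range(cols):
            if (r, c) in seen or hist[r][c] < threshold:
                continue
            # Breadth-first flood fill from an unvisited cell above water.
            queue = deque([(r, c)])
            seen.add((r, c))
            cells = set()
            while queue:
                y, x = queue.popleft()
                cells.add((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and (ny, nx) not in seen and hist[ny][nx] >= threshold:
                        seen.add((ny, nx))
                        queue.append((ny, nx))
            found.append(frozenset(cells))
    return found


def water_level_tree(hist, levels):
    """Build a list of (level, island, parent_index) entries.

    An island's parent is the island at the previous level that contains it;
    islands at the first level have parent None.
    """
    tree = []
    previous = []  # (index in tree, island) pairs from the previous level
    for level in levels:
        current = []
        for island in islands(hist, level):
            parent = next((i for i, p in previous if island <= p), None)
            tree.append((level, island, parent))
            current.append((len(tree) - 1, island))
        previous = current
    return tree


# Toy histogram: two dense corners separated by a sparse middle column.
hist = [
    [1000, 100, 1, 100, 1000],
    [100, 10, 1, 10, 100],
]
for level, island, parent in water_level_tree(hist, range(4)):
    print(level, sorted(island), parent)
```

In this toy grid, the single island at the lowest level splits into two islands once the level rises above the sparse middle column, so the root has two children. A split of this kind is the multi-resolution structure that the abstract describes traversing from the root.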
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Complex Network Analysis Techniques · Data-Driven Disease Surveillance
