Structure-based Anomaly Detection and Clustering
Filippo Leveni

TL;DR
This thesis introduces novel unsupervised methods for structure-based anomaly detection and clustering, including Preference Isolation Forest and MultiLink, applies them to streaming data and cybersecurity, and reports that they outperform existing techniques.
Contribution
The thesis presents new algorithms, including Preference Isolation Forest, MultiLink, and Online-iForest, for anomaly detection and clustering in structured, streaming, and cybersecurity data.
Findings
Preference Isolation Forest outperforms existing methods on synthetic and real datasets.
MultiLink recovers structures from multiple geometric model families robustly and quickly.
Online-iForest achieves real-time anomaly detection with accuracy comparable to offline models.
Abstract
Anomaly detection is a fundamental problem in domains such as healthcare, manufacturing, and cybersecurity. This thesis proposes new unsupervised methods for anomaly detection in both structured and streaming data settings. In the first part, we focus on structure-based anomaly detection, where normal data follows low-dimensional manifolds while anomalies deviate from them. We introduce Preference Isolation Forest (PIF), which embeds data into a high-dimensional preference space via manifold fitting, and isolates outliers using two variants: Voronoi-iForest, based on geometric distances, and RuzHash-iForest, leveraging Locality Sensitive Hashing for scalability. We also propose Sliding-PIF, which captures local manifold information for streaming scenarios. Our methods outperform existing techniques on synthetic and real datasets. We extend this to structure-based clustering with…
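The preference-embedding idea described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: it assumes 2D points, line models fitted exhaustively to every pair of points (rather than by random sampling), binary preferences with a hypothetical inlier threshold `eps`, and it swaps the isolation-forest step for a simple k-nearest-neighbour score under the Tanimoto distance. All function names are illustrative.

```python
import math
from itertools import combinations

def fit_line(p, q):
    """Line a*x + b*y + c = 0 through p and q, with (a, b) of unit length."""
    a, b = q[1] - p[1], p[0] - q[0]
    norm = math.hypot(a, b)
    a, b = a / norm, b / norm
    return a, b, -(a * p[0] + b * p[1])

def preference_embedding(points, eps=0.1):
    """Map each point to a binary vector: 1 where it lies within eps of a model.

    Assumption: one candidate model per pair of points (exhaustive, for determinism).
    """
    models = [fit_line(p, q) for p, q in combinations(points, 2)]
    return [
        [1.0 if abs(a * x + b * y + c) <= eps else 0.0 for a, b, c in models]
        for x, y in points
    ]

def tanimoto_distance(u, v):
    """1 - <u,v> / (|u|^2 + |v|^2 - <u,v>); defined as 1 when both vectors are zero."""
    dot = sum(ui * vi for ui, vi in zip(u, v))
    denom = sum(ui * ui for ui in u) + sum(vi * vi for vi in v) - dot
    return 1.0 - dot / denom if denom > 0 else 1.0

def knn_scores(prefs, k=3):
    """Stand-in anomaly score: mean distance to the k nearest preference vectors."""
    scores = []
    for i, u in enumerate(prefs):
        dists = sorted(
            tanimoto_distance(u, v) for j, v in enumerate(prefs) if j != i
        )
        scores.append(sum(dists[:k]) / k)
    return scores

# Toy data: ten points on the line y = 2x + 1, plus three points off it.
inliers = [(float(x), 2.0 * x + 1.0) for x in range(10)]
outliers = [(2.0, 9.0), (7.0, 0.0), (5.0, 20.0)]
points = inliers + outliers

scores = knn_scores(preference_embedding(points))
for point, score in zip(points, scores):
    print(point, round(score, 3))
```

In this toy setup, points on the shared line agree on many models and therefore sit close together in preference space, while the off-line points agree with almost nothing and receive higher scores. An isolation-based detector, as the abstract describes, would replace the nearest-neighbour score used here.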
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Network Security and Intrusion Detection · Advanced Malware Detection Techniques
