RPG-AE: Neuro-Symbolic Graph Autoencoders with Rare Pattern Mining for Provenance-Based Anomaly Detection
Asif Tauhid, Sidahmed Benabderrahmane, Mohamad Altrabulsi, Ahamed Foisal, Talal Rahwan

TL;DR
This paper introduces a neuro-symbolic graph autoencoder framework enhanced with rare pattern mining to improve the detection of sophisticated cyber threats in system provenance data, demonstrating superior performance over existing methods.
Contribution
It presents a novel unified model combining graph autoencoding with rare pattern mining for enhanced anomaly detection in cybersecurity.
Findings
Outperforms baseline graph autoencoder in anomaly ranking
Achieves competitive results with ensemble methods
Enhances interpretability through pattern mining integration
Abstract
Advanced Persistent Threats (APTs) are sophisticated, long-term cyberattacks that are difficult to detect because they operate stealthily and often blend into normal system behavior. This paper presents a neuro-symbolic anomaly detection framework that combines a Graph Autoencoder (GAE) with rare pattern mining to identify APT-like activities in system-level provenance data. Our approach first constructs a process behavioral graph using k-Nearest Neighbors based on feature similarity, then learns normal relational structure using a Graph Autoencoder. Anomaly candidates are identified through deviations between observed and reconstructed graph structure. To further improve detection, we integrate an rare pattern mining module that discovers infrequent behavioral co-occurrences and uses them to boost anomaly scores for processes exhibiting rare signatures. We evaluate the proposed method…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsAdvanced Graph Neural Networks · Anomaly Detection Techniques and Applications · Scientific Computing and Data Management
