The Needle is a Thread: Finding Planted Paths in Noisy Process Trees
Maya Le, Paweł Prałat, Aaron Smith, François Théberge

TL;DR
This paper introduces the "planted path" problem and presents an algorithm for finding fuzzy matchings between two trees, which supports cybersecurity data analysis by identifying meaningful event sequences within noisy log data.
Contribution
It proposes a novel algorithm for detecting planted paths in noisy trees, applicable as a building block for complex cybersecurity data workflows.
Findings
Effective in mining synthetic data
Useful in real-world cybersecurity datasets
Demonstrates practical applicability
Abstract
Motivated by applications in cybersecurity, such as finding meaningful sequences of malware-related events buried inside large amounts of computer log data, we introduce the "planted path" problem and propose an algorithm to find fuzzy matchings between two trees. This algorithm can be used as a "building block" for more complicated workflows. We demonstrate the usefulness of several such workflows in mining synthetically generated data as well as real-world ACME cybersecurity datasets.
Taxonomy
Topics: Software System Performance and Reliability · Business Process Modeling and Analysis · Data Mining Algorithms and Applications
