Predicting Process Name from Network Data
Justin Allen, David Knapp, Kristine Monteith

TL;DR
This paper presents a machine learning approach that predicts process names from network data with high accuracy. It identifies applications using only netflow-like features, a capability that could support cyber defense.
Contribution
The paper applies random forests and multilayer perceptrons to netflow-like features to predict the process that generated the traffic, using ground-truth labels from host-based sensors deployed in a large enterprise environment.
Findings
High classification accuracy achieved using only netflow-like features
Effective differentiation between browser and non-browser traffic
Successful process name prediction using random forests and multilayer perceptrons
Abstract
The ability to identify applications based on the network data they generate could be a valuable tool for cyber defense. We report on a machine learning technique capable of using netflow-like features to predict the application that generated the traffic. In our experiments, we used ground-truth labels obtained from host-based sensors deployed in a large enterprise environment; we applied random forests and multilayer perceptrons to the tasks of browser vs. non-browser identification, browser fingerprinting, and process name prediction. For each of these tasks, we demonstrate how machine learning models can achieve high classification accuracy using only netflow-like features as the basis for classification.
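The abstract describes three classification tasks built on the same host-sensor ground truth. The following is a minimal sketch, not the authors' code, of how a single labeled flow record might yield a feature vector and one target per task. The field names, the `FlowRecord` type, and the set of browser process names are all hypothetical and chosen only for illustration; the paper's actual feature set and label set are not given here.

```python
from typing import Dict, List, NamedTuple, Optional

# Hypothetical browser process names (assumption, not from the paper).
BROWSER_PROCESSES = {"chrome.exe", "firefox.exe", "msedge.exe"}


class FlowRecord(NamedTuple):
    """One netflow-like record joined with a host-sensor label.

    All fields are illustrative assumptions.
    """
    duration_s: float
    bytes_sent: int
    bytes_received: int
    packets: int
    dst_port: int
    process_name: str  # ground truth reported by the host-based sensor


def features(flow: FlowRecord) -> List[float]:
    """Return the numeric feature vector; the process name is excluded."""
    return [
        flow.duration_s,
        float(flow.bytes_sent),
        float(flow.bytes_received),
        float(flow.packets),
        float(flow.dst_port),
    ]


def task_labels(flow: FlowRecord) -> Dict[str, Optional[str]]:
    """Derive one target per task from the ground-truth process name."""
    name = flow.process_name.lower()
    is_browser = name in BROWSER_PROCESSES
    return {
        # Task 1: binary browser vs. non-browser identification.
        "browser_vs_non_browser": "browser" if is_browser else "non-browser",
        # Task 2: browser fingerprinting, defined only for browser flows.
        "browser_fingerprint": name if is_browser else None,
        # Task 3: multiclass process name prediction.
        "process_name": name,
    }


if __name__ == "__main__":
    flow = FlowRecord(1.5, 1200, 50000, 40, 443, "Chrome.exe")
    print(features(flow))
    print(task_labels(flow))
```

Feature vectors and targets prepared this way could then be passed to any standard classifier, such as a random forest or a multilayer perceptron. The classifier sees only the flow features, and the process name is used only as the training label.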
Taxonomy
Topics: Internet Traffic Analysis and Secure E-voting · Network Security and Intrusion Detection · Advanced Malware Detection Techniques
