Predicting Process Name from Network Data

Justin Allen; David Knapp; Kristine Monteith

arXiv:2109.03328·cs.CR·September 9, 2021

Open Access

TL;DR

This paper reports a machine learning approach that predicts the name of the process that generated network traffic using only netflow-like features, a capability that could support cyber defense by identifying applications from network data alone.

Contribution

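It applies random forests and multilayer perceptrons to netflow-like features, with ground-truth labels from host-based sensors in a large enterprise environment, and reports high classification accuracy for browser vs. non-browser identification, browser fingerprinting, and process name prediction.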

Findings

01 High classification accuracy on all three tasks using only netflow-like features

02 Browser traffic distinguished from non-browser traffic, and individual browsers fingerprinted

03 Process names predicted with random forests and multilayer perceptrons

Abstract

The ability to identify applications based on the network data they generate could be a valuable tool for cyber defense. We report on a machine learning technique capable of using netflow-like features to predict the application that generated the traffic. In our experiments, we used ground-truth labels obtained from host-based sensors deployed in a large enterprise environment; we applied random forests and multilayer perceptrons to the tasks of browser vs. non-browser identification, browser fingerprinting, and process name prediction. For each of these tasks, we demonstrate how machine learning models can achieve high classification accuracy using only netflow-like features as the basis for classification.
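The browser vs. non-browser task described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the feature names, the synthetic data generator, and the model hyperparameters are hypothetical and are not taken from the paper, which does not list its features or settings in the abstract. The sketch uses scikit-learn's `RandomForestClassifier` and `MLPClassifier` as stand-ins for the two model families the authors name.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical netflow-like feature names (assumed, not from the paper).
FEATURES = ["duration_s", "bytes_sent", "bytes_received", "packets", "dst_port"]


def make_synthetic_flows(n_per_class=500, seed=0):
    """Generate toy flow records: label 1 = browser, label 0 = non-browser.

    The distributions are invented and deliberately easy to separate.
    """
    rng = np.random.default_rng(seed)
    browser = np.column_stack([
        rng.exponential(2.0, n_per_class),        # short-lived flows
        rng.lognormal(7.0, 1.0, n_per_class),     # small requests
        rng.lognormal(10.0, 1.0, n_per_class),    # larger responses
        rng.poisson(40, n_per_class),
        rng.choice([80, 443], n_per_class),       # web ports
    ])
    other = np.column_stack([
        rng.exponential(30.0, n_per_class),       # longer-lived flows
        rng.lognormal(9.0, 1.0, n_per_class),
        rng.lognormal(8.0, 1.0, n_per_class),
        rng.poisson(200, n_per_class),
        rng.choice([22, 53, 445, 3389], n_per_class),
    ])
    X = np.vstack([browser, other])
    y = np.concatenate([
        np.ones(n_per_class, dtype=int),
        np.zeros(n_per_class, dtype=int),
    ])
    return X, y


def train_and_score(seed=0):
    """Fit both model families and return held-out accuracy for each."""
    X, y = make_synthetic_flows(seed=seed)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed, stratify=y
    )
    models = {
        "random_forest": RandomForestClassifier(
            n_estimators=100, random_state=seed
        ),
        # The MLP is sensitive to feature scale, so standardize first.
        "mlp": make_pipeline(
            StandardScaler(),
            MLPClassifier(hidden_layer_sizes=(32,), max_iter=500,
                          random_state=seed),
        ),
    }
    return {
        name: model.fit(X_train, y_train).score(X_test, y_test)
        for name, model in models.items()
    }


scores = train_and_score()
```

Because the toy classes are built to be separable, the resulting accuracies say nothing about the paper's results. In a real setting the labels would come from host-based sensors, as the abstract describes, and the same pipeline shape would extend to the multi-class tasks (browser fingerprinting and process name prediction) by using more than two label values.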

Taxonomy

Topics: Internet Traffic Analysis and Secure E-voting · Network Security and Intrusion Detection · Advanced Malware Detection Techniques