Adversarial Networks and Machine Learning for File Classification

Ken St. Germain; Josh Angichiodo

arXiv:2301.11964 · cs.LG · February 3, 2023 · 1 citation

Open Access

TL;DR

This paper presents a semi-supervised adversarial neural network that accurately classifies file types even when file headers or extensions are obfuscated, outperforming the compared methods, especially when few labeled samples are available.

Contribution

The authors introduce a semi-supervised generative adversarial network (SGAN) for file classification and report higher accuracy than the compared models when file extensions or headers are obfuscated.

Findings

01 Achieved 97.6% accuracy across 11 file types

02 Outperformed a traditional standalone neural network and three other machine learning algorithms

03 Effective in scenarios with few labeled (supervised) samples

Abstract

Correctly identifying the type of file under examination is a critical part of a forensic investigation. The file type alone suggests the embedded content, such as a picture, video, manuscript, or spreadsheet. In cases where a system owner might want to keep their files inaccessible or conceal their file types, we propose using an adversarially trained neural network to determine a file's true type even if the extension or file header is obfuscated to complicate its discovery. Our semi-supervised generative adversarial network (SGAN) achieved 97.6% accuracy in classifying files across 11 different types. We also compared our network against a traditional standalone neural network and three other machine learning algorithms. The adversarially trained network proved to be the most precise file classifier, especially in scenarios with few supervised samples available. Our…
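
The abstract does not say which input features or loss formulation the SGAN uses, so the following is only an illustrative sketch under stated assumptions, not the authors' implementation. It assumes (1) a byte-frequency histogram as the input feature, computed after optionally skipping a number of leading bytes so that an obfuscated header is ignored, and (2) a commonly used SGAN discriminator head in which K class logits yield both a softmax over the K file types and a real-versus-generated probability Z / (Z + 1), with Z the sum of the exponentiated logits. The names `byte_histogram`, `class_probabilities`, `real_probability`, and the `skip_header` parameter are hypothetical; only K = 11 comes from the abstract.

```python
import math

NUM_FILE_TYPES = 11  # K, the number of file types reported in the abstract


def byte_histogram(data: bytes, skip_header: int = 0) -> list[float]:
    """Normalized frequency of each byte value (0-255).

    ASSUMPTION: this feature choice and the skip_header parameter are
    illustrative; the source does not describe the network's inputs.
    """
    body = data[skip_header:]
    counts = [0] * 256
    for value in body:
        counts[value] += 1
    total = len(body) or 1  # avoid division by zero on empty input
    return [c / total for c in counts]


def class_probabilities(logits: list[float]) -> list[float]:
    """Softmax over the K file-type logits (supervised output)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def real_probability(logits: list[float]) -> float:
    """P(real) = Z / (Z + 1) with Z = sum(exp(logits)).

    ASSUMPTION: a common SGAN formulation, not confirmed by the source.
    Computed as sigmoid(log Z) for numerical stability.
    """
    peak = max(logits)
    log_z = peak + math.log(sum(math.exp(x - peak) for x in logits))
    if log_z >= 0:
        return 1.0 / (1.0 + math.exp(-log_z))
    e = math.exp(log_z)
    return e / (1.0 + e)


if __name__ == "__main__":
    # Hypothetical sample: 4 "header" bytes followed by a repetitive body.
    sample = b"\x00\x01\x02\x03" + b"AB" * 8
    features = byte_histogram(sample, skip_header=4)
    print(features[ord("A")], features[ord("B")])

    # Untrained stand-in logits; a real discriminator would produce these.
    logits = [0.0] * NUM_FILE_TYPES
    print(class_probabilities(logits)[0], real_probability(logits))
```

In this formulation a single set of K logits serves both tasks: labeled files train the softmax output, while unlabeled and generated samples train only the real-versus-generated probability, which is one way a classifier can benefit from files that have no label.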

Taxonomy

Topics: Digital Media Forensic Detection · Digital and Cyber Forensics · Advanced Malware Detection Techniques