A new approach to content-based file type detection
M. C. Amirani, M. Toorani, A. A. Beheshti

TL;DR
This paper proposes a content-based file type detection method built on PCA and neural networks. The authors report good accuracy and adequate speed, and the method avoids the reliance on file extensions and magic bytes that makes classical techniques easy to spoof, with potential applications in security and file management.
Contribution
The paper combines PCA and neural networks for content-based file type detection and clustering, addressing the main weakness of classical methods: file extensions and magic bytes are easily spoofed.
Findings
High accuracy in file type detection
Fast processing suitable for real-time applications
Effective clustering of file types
Abstract
File type identification and file type clustering can be difficult tasks, and they are of increasing importance in the field of computer and network security. Classical methods of file type detection, such as checking file extensions and magic bytes, can be easily spoofed. Content-based file type detection is a newer approach that has recently received attention. This paper proposes a new content-based method for file type detection and file type clustering that is based on PCA and neural networks. The proposed method has good accuracy and is fast enough.
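To make the contrast between magic-byte checks and content-based detection concrete, the following is a minimal sketch and not the authors' implementation. It assumes the content feature is a normalized byte-frequency histogram, which this summary does not state. It computes only the first principal component (by power iteration) and omits the neural network stage entirely. All function names and the synthetic "text" and "binary" samples are hypothetical.

```python
import math
import random

PNG_MAGIC = b"\x89PNG\r\n\x1a\n"  # the 8-byte PNG signature


def looks_like_png(data):
    """Classical check: inspects only the leading magic bytes."""
    return data.startswith(PNG_MAGIC)


def byte_histogram(data):
    """Assumed content feature: normalized frequency of each byte value."""
    counts = [0] * 256
    for b in data:
        counts[b] += 1
    total = len(data) or 1
    return [c / total for c in counts]


def top_principal_component(rows, iters=50):
    """Return (mean, first principal direction) via power iteration."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    # Random start: a uniform vector would be orthogonal to centered histograms.
    rng = random.Random(0)
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    for _ in range(iters):
        # One step of v <- X^T X v, then renormalize.
        scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
        w = [sum(scores[i] * centered[i][j] for i in range(n)) for j in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return mean, v


def project(hist, mean, direction):
    """Coordinate of a histogram along the principal direction."""
    return sum((h - m) * x for h, m, x in zip(hist, mean, direction))


# Synthetic stand-ins for two file types (hypothetical data).
rng = random.Random(1)
alphabet = b"abcdefghijklmnopqrstuvwxyz \n"
text_samples = [bytes(rng.choice(alphabet) for _ in range(2000)) for _ in range(5)]
binary_samples = [bytes(rng.randrange(256) for _ in range(2000)) for _ in range(5)]

histograms = [byte_histogram(s) for s in text_samples + binary_samples]
mean, pc1 = top_principal_component(histograms)

# A text payload with a forged PNG signature prepended.
spoofed = PNG_MAGIC + text_samples[0]
print(looks_like_png(spoofed))  # True: the magic-byte check is fooled

text_side = project(byte_histogram(text_samples[0]), mean, pc1) > 0
spoofed_side = project(byte_histogram(spoofed), mean, pc1) > 0
print(spoofed_side == text_side)  # whether the content feature still groups it with text
```

In this sketch the sign of the projection stands in for a classifier. The method described in the abstract instead feeds PCA-reduced features to neural networks, and the details of that stage are not given here.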
Taxonomy
Methods: Principal Components Analysis
