Malware Detection based on API Calls: A Reproducibility Study

Juhani Merilehto

arXiv:2601.08725·cs.CR·January 14, 2026

TL;DR

This paper reproduces and validates a malware detection method based on API call frequency analysis, confirming its effectiveness and robustness with consistent results across multiple experiments.

Contribution

It independently reproduces the original API call-based malware detection approach, confirming its high accuracy, reproducibility, and practical viability.

Findings

01

F1-scores exceeded original results by up to 2.57%.

02

High reproducibility with standard deviations below 0.5%.

03

Unigram model proved effective as a lightweight detector.

Abstract

This study independently reproduces the malware detection methodology presented by Fellicious et al. [7], which employs order-invariant API call frequency analysis using Random Forest classification. We utilized the original public dataset (250,533 training samples, 83,511 test samples) and replicated four model variants: Unigram, Bigram, Trigram, and Combined n-gram approaches. Our reproduction successfully validated all key findings, achieving F1-scores that exceeded the original results by 0.99% to 2.57% across all models at the optimal API call length of 2,500. The Unigram model achieved F1=0.8717 (original: 0.8631), confirming its effectiveness as a lightweight malware detector. Across three independent experimental runs with different random seeds, we observed remarkably consistent results with standard deviations below 0.5%, demonstrating high reproducibility. This study…
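To make the idea of order-invariant API call frequency features concrete, the following is a minimal sketch and not the authors' implementation. It assumes each sample is a list of API call names truncated to the first 2,500 calls, and that features are plain n-gram counts. The function names `ngram_counts` and `to_vector`, the example traces, and the API names in them are hypothetical and chosen only for illustration.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def ngram_counts(calls: Sequence[str], n: int, max_calls: int = 2500) -> Counter:
    """Count n-grams over the first `max_calls` API calls of one trace.

    Assumption: truncating to a fixed call length and counting n-grams is
    how the frequency features are built; the abstract does not spell
    out the exact procedure.
    """
    window = calls[:max_calls]
    return Counter(tuple(window[i:i + n]) for i in range(len(window) - n + 1))


def to_vector(counts: Counter, vocabulary: Sequence[Tuple[str, ...]]) -> List[int]:
    """Project n-gram counts onto a fixed vocabulary to get a feature vector."""
    return [counts.get(term, 0) for term in vocabulary]


# Two hypothetical traces with the same calls in a different order.
trace_a = ["NtOpenFile", "NtReadFile", "NtClose", "NtReadFile"]
trace_b = ["NtReadFile", "NtClose", "NtReadFile", "NtOpenFile"]

unigrams_a = ngram_counts(trace_a, 1)
unigrams_b = ngram_counts(trace_b, 1)

# Hypothetical fixed vocabulary of unigrams.
vocab = [("NtOpenFile",), ("NtReadFile",), ("NtClose",)]
vector_a = to_vector(unigrams_a, vocab)
vector_b = to_vector(unigrams_b, vocab)
```

In this sketch the two traces produce identical unigram vectors, which is the sense in which unigram frequencies ignore call order, while their bigram counts differ because each bigram still records which call follows which. Vectors of this kind could then be passed to a Random Forest classifier (for example scikit-learn's `RandomForestClassifier`), which is not shown here.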

Taxonomy

Topics: Advanced Malware Detection Techniques · Network Security and Intrusion Detection · Software Engineering Research