Leveraging Large Language Models to Detect npm Malicious Packages
Nusrat Zahan, Philipp Burckhardt, Mikola Lysenko, Feross Aboukhadijeh, Laurie Williams

TL;DR
This paper evaluates the effectiveness of Large Language Models, specifically GPT-3 and GPT-4, in detecting malicious npm packages, showing significant improvements over static analysis and highlighting cost and accuracy benefits.
Contribution
It introduces SocketAI, an LLM-based workflow for malicious code detection, and provides a comprehensive empirical comparison with static analysis tools on a large npm dataset.
Findings
GPT-4 achieves 99% precision and 97% F1 score.
Pre-screening reduces analysis costs by over 60%.
LLMs outperform static analysis in malicious code detection.
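The precision and F1 figures above are linked by the standard F1 definition, F1 = 2PR / (P + R), so an approximate recall can be backed out of them. The sketch below is illustrative only: the function names are hypothetical, the inputs are the rounded values quoted in the findings, and the resulting recall is an inference from those rounded values, not a number reported on this page.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def recall_from(precision: float, f1: float) -> float:
    """Invert F1 = 2PR / (P + R) to recover recall R."""
    denom = 2 * precision - f1
    if denom <= 0:
        raise ValueError("no valid recall for this precision/F1 pair")
    return f1 * precision / denom


# Rounded values quoted in the findings (assumed inputs, not exact results).
implied_recall = recall_from(0.99, 0.97)
print(f"implied recall: {implied_recall:.3f}")
```

Because the inputs are rounded, the implied recall of roughly 95% should be read as a ballpark, not a reported result.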
Abstract
Existing malicious code detection techniques demand the integration of multiple tools to detect different malware patterns, often suffering from high misclassification rates. Therefore, malicious code detection techniques could be enhanced by adopting advanced, more automated approaches to achieve high accuracy and a low misclassification rate. The goal of this study is to aid security analysts in detecting malicious packages by empirically studying the effectiveness of Large Language Models (LLMs) in detecting malicious code. We present SocketAI, a malicious code review workflow to detect malicious code. To evaluate the effectiveness of SocketAI, we leverage a benchmark dataset of 5,115 npm packages, of which 2,180 packages have malicious code. We conducted a baseline comparison of GPT-3 and GPT-4 models with the state-of-the-art CodeQL static analysis tool, using 39 custom CodeQL…
Taxonomy
Topics: Advanced Malware Detection Techniques · Network Security and Intrusion Detection · Information and Cyber Security
Methods: Attention Is All You Need · Absolute Position Encodings · Linear Layer · Position-Wise Feed-Forward Layer · Label Smoothing · Transformer · Layer Normalization · Multi-Head Attention
