Exploring Large Language Models for Semantic Analysis and Categorization of Android Malware
Brandon J Walton, Mst Eshita Khatun, James M Ghawaly, Aisha Ali-Gombe

TL;DR
This paper investigates using the GPT-4o-mini LLM with strategic prompt engineering to enhance Android malware analysis, enabling faster identification and categorization of malware and pinpointing of malicious code snippets, all without fine-tuning.
Contribution
It introduces a novel LLM-based framework that improves malware categorization and summarization for Android apps through hierarchical prompts and backward tracing.
Findings
Achieves up to 77% classification accuracy without fine-tuning.
Provides detailed summaries at multiple levels of malware analysis.
Enables pinpointing of malicious code snippets via backward tracing.
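The hierarchical summarization and backward tracing named in the findings can be illustrated with a minimal sketch. The listing does not give the paper's prompts, tiers, or tracing procedure, so everything here is an assumption made for illustration: the three tiers (method, class, app), the `Node` structure, the `summarize_app` and `trace_back` helpers, and the `toy_llm` stub, which is a keyword rule standing in for a real GPT-4o-mini call.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical stand-in for an LLM call: takes a prompt, returns a summary.
Summarizer = Callable[[str], str]


@dataclass
class Node:
    name: str
    level: str  # assumed tiers: "method", "class", or "app"
    summary: str = ""
    children: List["Node"] = field(default_factory=list)


def summarize_app(app_name: str,
                  classes: Dict[str, Dict[str, str]],
                  llm: Summarizer) -> Node:
    """Summarize bottom-up: methods first, then classes, then the app."""
    class_nodes = []
    for cls_name, methods in classes.items():
        method_nodes = [
            Node(m_name, "method", llm(f"Summarize this method:\n{code}"))
            for m_name, code in methods.items()
        ]
        joined = "\n".join(n.summary for n in method_nodes)
        cls_summary = llm(f"Summarize this class from its methods:\n{joined}")
        class_nodes.append(Node(cls_name, "class", cls_summary, method_nodes))
    joined = "\n".join(n.summary for n in class_nodes)
    app_summary = llm(f"Summarize this app from its classes:\n{joined}")
    return Node(app_name, "app", app_summary, class_nodes)


def trace_back(node: Node, is_suspicious: Callable[[str], bool]) -> List[str]:
    """Follow suspicious summaries from the top tier down to leaf methods."""
    if not is_suspicious(node.summary):
        return []
    if not node.children:
        return [node.name]
    paths = []
    for child in node.children:
        paths += [f"{node.name}/{p}" for p in trace_back(child, is_suspicious)]
    return paths


def toy_llm(prompt: str) -> str:
    # Toy rule, NOT a real model: flag SMS sending, and propagate any
    # "suspicious" label found in lower-tier summaries.
    body = prompt.split("\n", 1)[1] if "\n" in prompt else prompt
    flagged = "sendTextMessage" in body or "suspicious" in body
    return "suspicious: sends SMS" if flagged else "benign"


# Made-up sample input, not data from the paper.
demo_classes = {
    "com.example.Main": {"onCreate": "setContentView(layout)"},
    "com.example.Sms": {
        "send": "smsManager.sendTextMessage(num, null, body, null, null)",
        "log": "Log.d(tag, msg)",
    },
}
tree = summarize_app("demo.apk", demo_classes, toy_llm)
hits = trace_back(tree, lambda s: s.startswith("suspicious"))
print(hits)
```

Keeping every tier's summary in the tree is what makes the backward step possible: a suspicious app-level summary can be followed down through class summaries to the individual methods that caused it. In a real pipeline the stub would be replaced by a model call, and the suspicion check would likely be a prompt of its own rather than a string prefix test.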
Abstract
Malware analysis is a complex process of examining and evaluating malicious software's functionality, origin, and potential impact. This arduous process typically involves dissecting the software to understand its components, infection vector, propagation mechanism, and payload. Over the years, deep reverse engineering of malware has become increasingly tedious, mainly due to modern malicious codebases' fast evolution and sophistication. Essentially, analysts are tasked with identifying the elusive needle in the haystack within the complexities of zero-day malware, all while under tight time constraints. Thus, in this paper, we explore leveraging Large Language Models (LLMs) for semantic malware analysis to expedite the analysis of known and novel samples. Built on the GPT-4o-mini model, the proposed framework is designed to augment malware analysis for Android through a hierarchical-tiered summarization…
Taxonomy
Topics: Advanced Malware Detection Techniques · Cybercrime and Law Enforcement Studies · Network Security and Intrusion Detection
