DetectBERT: Towards Full App-Level Representation Learning to Detect Android Malware
Tiezhu Sun, Nadia Daoudi, Kisub Kim, Kevin Allix, Tegawendé F. Bissyandé, Jacques Klein

TL;DR
DetectBERT combines a BERT-like model (DexBERT) with multiple instance learning to detect Android malware at the app level, capturing complex malicious behaviors that basic static-analysis features often miss.
Contribution
It introduces a novel framework that integrates DexBERT with correlated Multiple Instance Learning for effective app-level Android malware detection.
Findings
DetectBERT outperforms existing state-of-the-art methods.
It effectively handles high dimensionality and variability of malware features.
The framework shows adaptability to evolving malware threats.
Abstract
Recent advancements in ML and DL have significantly improved Android malware detection, yet many methodologies still rely on basic static analysis, bytecode, or function call graphs that often fail to capture complex malicious behaviors. DexBERT, a pre-trained BERT-like model tailored for Android representation learning, enriches class-level representations by analyzing Smali code extracted from APKs. However, its functionality is constrained by its inability to process multiple Smali classes simultaneously. This paper introduces DetectBERT, which integrates correlated Multiple Instance Learning (c-MIL) with DexBERT to handle the high dimensionality and variability of Android malware, enabling effective app-level detection. By treating class-level features as instances within MIL bags, DetectBERT aggregates these into a comprehensive app-level representation. Our evaluation demonstrates…
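The bag-of-instances idea in the abstract can be illustrated with a minimal sketch. It is not the authors' c-MIL module: it swaps in plain attention pooling, which scores each instance independently and does not model correlations between instances. The random vectors stand in for DexBERT class embeddings, and the names `attention_pool`, `predict_malware_probability`, and `EMBED_DIM`, along with the embedding size, are hypothetical choices made only for illustration.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(bag, query):
    """Aggregate a bag of instance vectors into one bag-level vector.

    bag   : list of class-level embeddings (one per Smali class)
    query : scoring vector (learned in practice; random here)
    """
    dim = len(query)
    # Scaled dot-product score for each instance in the bag.
    scores = [
        sum(q * x for q, x in zip(query, inst)) / math.sqrt(dim)
        for inst in bag
    ]
    weights = softmax(scores)
    # Weighted sum of instances gives the app-level representation.
    pooled = [
        sum(w * inst[d] for w, inst in zip(weights, bag))
        for d in range(dim)
    ]
    return pooled, weights


def predict_malware_probability(app_vector, classifier_weights, bias=0.0):
    """Logistic classifier on top of the pooled app-level vector."""
    logit = sum(w * x for w, x in zip(classifier_weights, app_vector)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


random.seed(0)
EMBED_DIM = 8  # assumption: illustrative size, not the real embedding width

# Assumption: random stand-ins for the embeddings of 5 Smali classes in one APK.
bag = [[random.gauss(0, 1) for _ in range(EMBED_DIM)] for _ in range(5)]
query = [random.gauss(0, 1) for _ in range(EMBED_DIM)]
clf = [random.gauss(0, 1) for _ in range(EMBED_DIM)]

app_vector, weights = attention_pool(bag, query)
prob = predict_malware_probability(app_vector, clf)
print(len(app_vector), round(sum(weights), 6))
```

Because the pooled vector has a fixed size regardless of how many classes the app contains, a single classifier can score apps of any size. A correlated variant would additionally let instances attend to one another before pooling.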
Taxonomy
Topics: Advanced Malware Detection Techniques · Network Security and Intrusion Detection · Mobile and Web Applications
