apk2vec: Semi-supervised multi-view representation learning for profiling Android applications
Annamalai Narayanan, Charlie Soh, Lihui Chen, Yang Liu, Lipo Wang

TL;DR
apk2vec is a semi-supervised multi-view embedding framework that efficiently generates rich app profiles from various semantic views, improving performance in malware detection, clustering, clone detection, and recommendations.
Contribution
It introduces a novel semi-supervised multi-view representation learning (RL) framework that combines RL with feature hashing for scalable app profiling.
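To make the feature hashing idea concrete, here is a minimal sketch of the generic hashing trick, which maps an unbounded vocabulary of string features into a fixed-length vector. It is not taken from the apk2vec implementation: the function name `hash_features`, the dimension `dim`, the signed-hash variant, and the sample feature strings are all illustrative assumptions.

```python
import hashlib


def hash_features(tokens, dim=16):
    """Map a bag of string features to a fixed-length vector (hashing trick).

    Hypothetical helper for illustration only; not apk2vec's actual code.
    """
    vec = [0.0] * dim
    for tok in tokens:
        # A stable hash (unlike built-in hash(), which is salted per process).
        digest = hashlib.md5(tok.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        idx = h % dim                      # bucket index in the fixed-size vector
        sign = 1.0 if (h >> 63) == 0 else -1.0  # signed hashing reduces collision bias
        vec[idx] += sign
    return vec


# Hypothetical features drawn from two views of an app (API calls, permissions).
features = [
    "api:android.telephony.SmsManager.sendTextMessage",
    "perm:android.permission.SEND_SMS",
    "perm:android.permission.INTERNET",
]
profile_input = hash_features(features, dim=16)
```

The vector length depends only on `dim`, not on how many distinct features appear across the corpus, which is what makes hashing attractive for large-scale profiling. The price is that distinct features can collide in the same bucket.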
Findings
Outperforms state-of-the-art in malware detection
Enhances app clustering accuracy
Improves clone detection and recommendation results
Abstract
Building behavior profiles of Android applications (apps) with holistic, rich and multi-view information (e.g., incorporating several semantic views of an app such as API sequences, system calls, etc.) would help cater to downstream analytics tasks such as app categorization, recommendation and malware analysis significantly better. Towards this goal, we design a semi-supervised Representation Learning (RL) framework named apk2vec to automatically generate a compact representation (aka profile/embedding) for a given app. More specifically, apk2vec has the following three unique characteristics which make it an excellent choice for large-scale app profiling: (1) it encompasses information from multiple semantic views such as API sequences, permissions, etc., (2) being a semi-supervised embedding technique, it can make use of labels associated with apps (e.g., malware family or app…
Taxonomy
Topics: Advanced Malware Detection Techniques · Mobile and Web Applications · Web Data Mining and Analysis
