Static Attribution of Android Residential Proxy Malware Using Graph Kernels
Peter Clark, Yong Guan, Zhonghao Liao

TL;DR
This paper introduces a static-analysis method using graph kernels to accurately attribute Android proxy malware to specific families, enhancing detection and explainability.
Contribution
It presents a novel static-analysis pipeline leveraging graph kernels and behavioral signatures for proxy malware family attribution with high accuracy.
Findings
SGD classifier achieved macro F1 of 0.985
YARA rules enabled up to 88.45% family accuracy
Over half of proxy apps still contain embedded SDK code
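For readers unfamiliar with the metric in the first finding, macro F1 is the unweighted mean of per-class F1 scores, so a small proxy family counts as much as a large one. The following is a minimal sketch of that arithmetic only; the function name and the family labels in the usage note are hypothetical and are not taken from the paper.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every class seen in either list."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

With hypothetical labels `["fam_a", "fam_a", "fam_a", "fam_b"]` as ground truth and `["fam_a", "fam_a", "fam_b", "fam_b"]` as predictions, the per-class scores are 0.8 and 2/3, giving a macro F1 of 11/15.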
Abstract
Android residential proxy applications represent a growing class of potentially unwanted programs (PUPs) that covertly route third-party traffic through end-user devices, enabling ad fraud, credential abuse, and evasion of geolocation controls by sophisticated threat actors. Attributing an unknown APK to a specific proxy network remains challenging due to code reuse, SDK embedding, and obfuscation across proxy families. We present a static-analysis pipeline for automated proxyware family attribution, extracting graph-structured representations (control-flow and function-call graphs) and behavioral signatures from a labeled corpus of 3,365 Android proxy apps spanning four commercial proxy networks. We evaluate Weisfeiler-Lehman graph kernel features alone and fused with binary capability vectors across multiple classifiers. Using 5-fold DEX-grouped cross-validation to prevent data…
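To illustrate the Weisfeiler-Lehman idea named in the abstract, here is a minimal sketch of WL subtree feature extraction on a small directed call graph. It is not the authors' implementation: the graph encoding (a dict of successor lists), the initial node labels such as "entry" and "net", the hash truncation, and the function names `wl_features` and `wl_kernel` are all assumptions made for illustration.

```python
from collections import Counter
import hashlib

def wl_features(adj, labels, iterations=2):
    """Count WL labels over `iterations` refinement rounds.

    adj:    dict mapping each node to a list of successor nodes
    labels: dict mapping each node to an initial string label
    """
    current = dict(labels)
    counts = Counter(current.values())  # round 0: the initial labels
    for _ in range(iterations):
        updated = {}
        for node, successors in adj.items():
            # New label = own label plus the sorted multiset of successor labels
            signature = current[node] + "|" + ",".join(
                sorted(current[s] for s in successors)
            )
            updated[node] = hashlib.sha1(signature.encode()).hexdigest()[:12]
        current = updated
        counts.update(current.values())
    return counts

def wl_kernel(a, b):
    """Dot product of two WL label-count vectors."""
    return sum(a[k] * b[k] for k in a.keys() & b.keys())

# Hypothetical call graphs: g2 is g1 with renamed nodes, g3 has a different shape.
node_labels = {"a": "entry", "b": "net", "c": "net"}
g1 = wl_features({"a": ["b", "c"], "b": ["c"], "c": []}, node_labels)
g2 = wl_features({"x": ["y", "z"], "y": ["z"], "z": []},
                 {"x": "entry", "y": "net", "z": "net"})
g3 = wl_features({"a": ["b"], "b": ["c"], "c": []}, node_labels)
```

Because the features depend only on labels and structure, `g1` and `g2` are identical, while `g3` shares fewer refined labels and therefore has a lower kernel value against `g1`. In a pipeline like the one described, such count vectors could be concatenated with binary capability vectors before classification, though the exact fusion scheme is not stated in the text above.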
