Understanding Android Obfuscation Techniques: A Large-Scale Investigation in the Wild
Shuaike Dong, Menghao Li, Wenrui Diao, Xiangyu Liu, Jian Liu, Zhou Li, Fenghao Xu, Kai Chen, Xiaofeng Wang, Kehuan Zhang

TL;DR
This study provides a comprehensive large-scale analysis of Android obfuscation techniques in real-world apps, revealing usage patterns and informing better detection and obfuscation strategies.
Contribution
It introduces efficient detection models for four obfuscation techniques and offers large-scale statistical insights into their application in the wild.
Findings
String encryption is more common in malware.
Packed apps are more prevalent on third-party markets.
Obfuscation usage varies significantly across app sources.
Abstract
In this paper, we seek to better understand Android obfuscation and depict a holistic view of the usage of obfuscation through a large-scale investigation in the wild. In particular, we focus on four popular obfuscation approaches: identifier renaming, string encryption, Java reflection, and packing. To obtain meaningful statistical results, we designed efficient and lightweight detection models for each obfuscation technique and applied them to our massive APK datasets (collected from Google Play, multiple third-party markets, and malware databases). We have learned several interesting facts from the results. For example, malware authors use string encryption more frequently, and more apps on third-party markets than Google Play are packed. We are also interested in the explanation of each finding. Therefore, we carry out in-depth code analysis on some Android apps after sampling. We…
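To make the idea of a lightweight detection model concrete, here is a minimal sketch of a heuristic for one of the four techniques, identifier renaming. It is not the authors' model, which the abstract does not describe. The function names, the short-name pattern, and the 0.5 threshold are all assumptions chosen for illustration. The sketch assumes the identifiers have already been extracted from an app's code by some other tool.

```python
import re

# Assumption: renaming tools often emit very short lowercase names
# such as "a", "b", "aa", "ab". This pattern is illustrative only.
SHORT_NAME = re.compile(r"^[a-z]{1,2}$")


def renamed_ratio(identifiers):
    """Return the fraction of identifiers that match the short-name pattern."""
    if not identifiers:
        return 0.0
    hits = sum(1 for name in identifiers if SHORT_NAME.match(name))
    return hits / len(identifiers)


def looks_renamed(identifiers, threshold=0.5):
    """Flag an app when short names dominate.

    The threshold is an arbitrary assumption, not a value from the paper.
    """
    return renamed_ratio(identifiers) >= threshold


if __name__ == "__main__":
    # Hypothetical identifier list, as if pulled from a decompiled app.
    sample = ["a", "b", "c", "MainActivity"]
    print(renamed_ratio(sample), looks_renamed(sample))
```

A ratio-based check like this is cheap enough to run over a large corpus, which is the property the abstract emphasizes. A real detector would need to handle legitimately short names and renaming schemes that use other alphabets.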
Taxonomy
Topics: Advanced Malware Detection Techniques · Digital and Cyber Forensics · Software Testing and Debugging Techniques
