AndroZoo++: Collecting Millions of Android Apps and Their Metadata for the Research Community
Li Li, Jun Gao, Médéric Hurier, Pingfan Kong, Tegawendé F. Bissyandé, Alexandre Bartel, Jacques Klein, and Yves Le Traon

TL;DR
AndroZoo++ is a comprehensive, growing dataset of over five million Android apps and their metadata, designed to support and advance research in Android security, analysis, and related fields.
Contribution
It introduces a large, continuously updated dataset of Android apps and metadata, facilitating reproducible research and enabling new Android app analysis studies.
Findings
Over five million apps collected
Includes over 20 types of metadata, such as VirusTotal reports
Supports reproducible Android research
Abstract
We present a growing collection of Android apps gathered from several sources, including the official Google Play app market, together with a growing collection of metadata for those apps, with the aim of facilitating Android-related research. So far, our dataset comprises over five million apps and over 20 types of metadata, such as VirusTotal reports. Our objective in collecting this dataset is to contribute to ongoing research efforts and to enable new potential research topics on Android apps. By releasing our app and metadata set to the research community, we also aim to encourage our fellow researchers to engage in reproducible experiments. This article will be continuously updated as the apps and metadata collected in the AndroZoo project grow. If you have specific metadata that you want to collect from AndroZoo and which are not yet provided by…
Taxonomy
Topics: Advanced Malware Detection Techniques · Software Testing and Debugging Techniques · Web Data Mining and Analysis
