AndroLibZoo: A Reliable Dataset of Libraries Based on Software Dependency Analysis
Jordan Samhi, Tegawendé F. Bissyandé, Jacques Klein

TL;DR
This paper introduces AndroLibZoo, a comprehensive and automated dataset of Android third-party libraries, addressing the limitations of existing white lists and aiding static analysis for improved accuracy and scalability.
Contribution
It presents a novel automated approach to generate an accurate, up-to-date dataset of Android libraries, significantly expanding and improving upon existing white lists.
Findings
Contains 34,813 libraries, making it the largest dataset of its kind.
Demonstrates the need for reliable library white lists in static analysis.
Provides an evolving dataset to support future research and tools.
Abstract
Android app developers extensively employ code reuse, integrating many third-party libraries into their apps. While such integration is practical for developers, it can be challenging for static analyzers to achieve scalability and precision when libraries account for a large part of the code. As a direct consequence, it is common practice in the literature to consider developer code only during static analysis, with the assumption that the sought issues are in developer code rather than the libraries. However, analysts need to distinguish between library and developer code. Currently, many static analyses rely on white lists of libraries. However, these white lists are unreliable, inaccurate, and largely non-comprehensive. In this paper, we propose a new approach to address the lack of comprehensive and automated solutions for the production of accurate and "always up to date" sets…
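To make the white-list filtering mentioned in the abstract concrete, here is a minimal sketch of how a static analysis might separate library classes from developer classes. It assumes the white list is a set of package-name prefixes. The function names, the sample prefixes, and the matching rule are illustrative assumptions, not AndroLibZoo's actual format or the authors' method.

```python
from __future__ import annotations

from typing import Iterable


def is_library_class(class_name: str, library_prefixes: set[str]) -> bool:
    """Return True if any enclosing package of class_name is white-listed."""
    parts = class_name.split(".")
    # Test each enclosing package ("a", "a.b", "a.b.c", ...) so that matches
    # respect package boundaries: "com.google.gsonx" is not "com.google.gson".
    return any(".".join(parts[:i]) in library_prefixes for i in range(1, len(parts)))


def split_classes(
    class_names: Iterable[str], library_prefixes: set[str]
) -> tuple[list[str], list[str]]:
    """Partition class names into (developer, library) lists."""
    developer: list[str] = []
    library: list[str] = []
    for name in class_names:
        target = library if is_library_class(name, library_prefixes) else developer
        target.append(name)
    return developer, library


# Hypothetical white list and app classes, for illustration only.
SAMPLE_PREFIXES = {"com.google.gson", "okhttp3"}
SAMPLE_CLASSES = [
    "com.google.gson.Gson",
    "okhttp3.OkHttpClient",
    "com.example.app.MainActivity",
    "com.google.gsonx.Fake",
]

developer_classes, library_classes = split_classes(SAMPLE_CLASSES, SAMPLE_PREFIXES)
```

In this sketch, an analysis would then restrict itself to `developer_classes`. The quality of that restriction depends entirely on the white list: a missing or outdated prefix silently moves library code into the developer set, which is the reliability problem the abstract points to.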
Taxonomy
Topics: Mobile and Web Applications · Advanced Malware Detection Techniques · Green IT and Sustainability
