Longitudinal Compliance Analysis of Android Applications with Privacy Policies
Saad Sajid Hashmi, Nazar Waheed, Gioacchino Tangari, Muhammad Ikram, Stephen Smith

TL;DR
This longitudinal study analyzes 5,240 releases of 268 Android apps published between 2008 and 2016. It finds a growing mismatch over time between what the apps' privacy policies state and what the apps actually do with user data, which points to persistent privacy transparency issues.
Contribution
The paper introduces a method that uses machine learning to classify privacy policy text and then compares the stated practices with app behavior, observed through static and dynamic analysis, across multiple app versions.
Findings
Increase in undisclosed data collection practices over time
Newer app versions tend to be more non-compliant
Privacy policies often do not match actual app behaviors
Abstract
Contemporary mobile applications (apps) are designed to track, use, and share users' data, often without their consent, which results in potential privacy and transparency issues. To investigate whether mobile apps have always been (non-)transparent regarding how they collect information about users, we perform a longitudinal analysis of the historical versions of 268 Android apps. These apps comprise 5,240 app releases or versions between 2008 and 2016. We detect inconsistencies between apps' behaviors and the stated use of data collection in privacy policies to reveal compliance issues. We utilize machine learning techniques for the classification of the privacy policy text to identify the purported practices that collect and/or share users' personal information, such as phone numbers and email addresses. We then uncover the data leaks of an app through static and dynamic analysis.…
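The comparison step described in the abstract can be illustrated with a minimal sketch. It assumes that each release has already been reduced to two sets of data types: those the policy classifier marks as disclosed, and those observed leaving the app during static or dynamic analysis. The `Release` class, its field names, and both functions are hypothetical and are not taken from the paper; the authors' actual pipeline is not shown in this summary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Release:
    """One app version (hypothetical structure, not from the paper)."""
    version: str
    year: int
    disclosed: frozenset  # data types the policy text says are collected or shared
    observed: frozenset   # data types seen leaking during static/dynamic analysis


def undisclosed(release):
    """Return data types that were observed but never disclosed in the policy."""
    return release.observed - release.disclosed


def non_compliant_share_by_year(releases):
    """Map each year to the fraction of releases with at least one undisclosed data type."""
    totals = {}
    flagged = {}
    for r in releases:
        totals[r.year] = totals.get(r.year, 0) + 1
        if undisclosed(r):
            flagged[r.year] = flagged.get(r.year, 0) + 1
    return {year: flagged.get(year, 0) / totals[year] for year in sorted(totals)}
```

Tracking the per-year share rather than raw counts keeps the trend comparable when the number of releases differs from year to year. A set difference is the simplest possible notion of inconsistency; a fuller treatment would also need to handle vague or contradictory policy statements.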
Taxonomy
Topics: Privacy, Security, and Data Protection · Green IT and Sustainability · Mobile Health and mHealth Applications
