More Data Types More Problems: A Temporal Analysis of Complexity, Stability, and Sensitivity in Privacy Policies
Juniper Lovato, Philip Mueller, Parisa Suchdev, Peter S. Dodds

TL;DR
This study analyzes privacy policies from 1997 to 2019 to understand how data collection practices evolve, revealing trends in stability, complexity, and sensitivity related to legislative changes and privacy events.
Contribution
The paper introduces a novel lexicon of PII data types and provides a mesoscale temporal analysis of how privacy policies evolved over two decades.
Findings
Privacy legislation impacts stability and turbulence of PII data types.
Complexity of privacy policies decreases and becomes more regular over time.
Sensitivity of privacy policies increases with legislative and privacy events.
Abstract
Collecting personally identifiable information (PII) on data subjects has become big business. Data brokers and data processors are part of a multi-billion-dollar industry that profits from collecting, buying, and selling consumer data. Yet there is little transparency in the data collection industry, which makes it difficult to understand what types of data are being collected, used, and sold, and thus the risk to individual data subjects. In this study, we examine a large textual dataset of privacy policies from 1997 to 2019 in order to investigate the data collection activities of data brokers and data processors. We also develop an original lexicon of PII-related terms representing PII data types curated from legislative texts. This mesoscale analysis looks at privacy policies over time on the word, topic, and network levels to understand the stability, complexity, and sensitivity of…
Taxonomy
Topics: Privacy, Security, and Data Protection · Privacy-Preserving Technologies in Data
