Polisis: Automated Analysis and Presentation of Privacy Policies Using Deep Learning
Hamza Harkous, Kassem Fawaz, Rémi Lebret, Florian Schaub, Kang G. Shin, Karl Aberer

TL;DR
Polisis is an automated, scalable framework leveraging deep learning to analyze, classify, and answer questions about privacy policies, improving accessibility and understanding for users, regulators, and companies.
Contribution
Polisis introduces a novel privacy-centric language model and a hierarchy of neural-network classifiers for fine-grained privacy policy analysis, and it supports both structured and free-form queries.
Findings
Polisis assigns privacy icons with an average accuracy of 88.4%
PriBot returns a correct answer among its top-3 results for 82% of test questions
In a user study, at least one of PriBot's top-3 answers was judged relevant for 89% of test questions
Abstract
Privacy policies are the primary channel through which companies inform users about their data collection and sharing practices. These policies are often long and difficult to comprehend. Short notices based on information extracted from privacy policies have been shown to be useful but face a significant scalability hurdle, given the number of policies and their evolution over time. Companies, users, researchers, and regulators still lack usable and scalable tools to cope with the breadth and depth of privacy policies. To address these hurdles, we propose an automated framework for privacy policy analysis (Polisis). It enables scalable, dynamic, and multi-dimensional queries on natural language privacy policies. At the core of Polisis is a privacy-centric language model, built with 130K privacy policies, and a novel hierarchy of neural-network classifiers that accounts for both…
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Privacy, Security, and Data Protection · Internet Traffic Analysis and Secure E-voting
