Large Language Models: A New Approach for Privacy Policy Analysis at Scale

David Rodriguez; Ian Yang; Jose M. Del Alamo; Norman Sadeh

arXiv:2405.20900 · cs.CL · December 22, 2025 · 1 citation

Open Access

TL;DR

This paper demonstrates that Large Language Models like ChatGPT and Llama 2 can effectively automate privacy policy analysis at scale, outperforming traditional NLP methods in accuracy, cost, and efficiency.

Contribution

It introduces a novel application of LLMs for privacy policy analysis, providing guidance on prompt design and validating performance with benchmark datasets.

Findings

01. F1 score exceeds 93% on benchmark datasets
02. Reduces costs and processing times compared to traditional methods
03. Requires less technical expertise for implementation
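
For readers unfamiliar with the metric in the first finding, the sketch below shows how an F1 score is computed from classification counts. The function name and the counts are made up for illustration and are not taken from the paper's evaluation.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Return the harmonic mean of precision and recall (0.0 if undefined)."""
    if true_positives == 0:
        # Precision and recall are both zero or undefined in this case.
        return 0.0
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, NOT results from the paper:
# 8 practices correctly extracted, 2 spurious, 2 missed.
example_f1 = f1_score(true_positives=8, false_positives=2, false_negatives=2)
print(f"F1 = {example_f1:.2f}")
```

Because F1 penalizes both spurious and missed extractions, a score above 93% implies that precision and recall are both high.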

Abstract

The number and dynamic nature of web and mobile applications present significant challenges for assessing their compliance with data protection laws. In this context, symbolic and statistical Natural Language Processing (NLP) techniques have been employed for the automated analysis of these systems' privacy policies. However, these techniques typically require labor-intensive and potentially error-prone manually annotated datasets for training and validation. This research proposes the application of Large Language Models (LLMs) as an alternative for effectively and efficiently extracting privacy practices from privacy policies at scale. In particular, we leverage well-known LLMs such as ChatGPT and Llama 2, and offer guidance on the optimal design of prompts, parameters, and models, incorporating advanced strategies such as few-shot learning. We further illustrate its capability to…
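
The few-shot prompting mentioned in the abstract can be illustrated with a minimal sketch. It only assembles a prompt string; the function name, the instruction wording, the example excerpts, and the labels are all hypothetical and are not the authors' prompts. Sending the prompt to a model such as ChatGPT or Llama 2 is left out, since the page does not describe that step.

```python
from typing import List, Tuple


def build_few_shot_prompt(task: str, examples: List[Tuple[str, str]], query: str) -> str:
    """Assemble an instruction, labeled examples, and an unlabeled query into one prompt."""
    lines = [task, ""]
    for excerpt, answer in examples:
        lines.append(f"Policy excerpt: {excerpt}")
        lines.append(f"Answer: {answer}")
        lines.append("")
    # The final excerpt is left unanswered so the model completes it.
    lines.append(f"Policy excerpt: {query}")
    lines.append("Answer:")
    return "\n".join(lines)


# Hypothetical task description and examples, for illustration only.
prompt = build_few_shot_prompt(
    task="Does the excerpt state that location data is collected? Answer yes or no.",
    examples=[
        ("We collect your device's GPS coordinates.", "yes"),
        ("We do not sell your email address.", "no"),
    ],
    query="Our app may access your approximate location.",
)
print(prompt)
```

Keeping the examples in a list makes it straightforward to vary how many are included and compare zero-shot against few-shot behavior.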

Taxonomy

Topics: Privacy, Security, and Data Protection · Privacy-Preserving Technologies in Data

Methods: LLaMA