Towards Automatic Comparison of Data Privacy Documents: A Preliminary Experiment on GDPR-like Laws
Kornraphop Kawintiranon, Yaguang Liu

TL;DR
This paper explores an NLP-based method using BERT to automatically compare GDPR-like data privacy laws across different countries, aiming to reduce manual legal analysis effort.
Contribution
It introduces a simple NLP approach leveraging BERT for structured data extraction and similarity measurement of privacy regulations, demonstrating its effectiveness.
Findings
BERT with cosine similarity outperforms baseline models
The approach reduces manual effort in legal document comparison
Data and code are publicly available for further research
Abstract
The General Data Protection Regulation (GDPR) has become a standard law for data protection in many countries. Currently, twelve countries have adopted the regulation and established their own GDPR-like regulations. However, evaluating the differences and similarities among these GDPR-like regulations is time-consuming and requires considerable manual effort from legal experts. Moreover, GDPR-like regulations from different countries are written in their own languages, which makes the task harder because it requires legal experts who know both languages. In this paper, we investigate a simple natural language processing (NLP) approach to tackle the problem. We first extract chunks of information from GDPR-like documents and form structured data from natural language. Next, we use NLP methods to compare documents and measure their similarity. Finally, we manually label a small set of data to evaluate our approach. The…
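The compare step described in the abstract (extract chunks, represent each as a vector, score pairs by cosine similarity) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: `embed` is a hypothetical stand-in that uses bag-of-words counts so the example runs without downloading a model, whereas the paper reports using BERT representations; `best_matches` and the sample chunks are likewise invented and are not taken from the authors' code or data.

```python
import math
import re
from collections import Counter


def embed(text):
    """Hypothetical stand-in embedding: lowercase bag-of-words counts.

    This is NOT BERT. A real pipeline would return a dense vector
    produced by a language model instead of a word-count mapping.
    """
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty chunk is treated as similar to nothing
    return dot / (norm_u * norm_v)


def best_matches(chunks_a, chunks_b):
    """For each chunk in A, return (index of the closest chunk in B, score)."""
    vecs_b = [embed(c) for c in chunks_b]
    results = []
    for chunk in chunks_a:
        vec_a = embed(chunk)
        scores = [cosine_similarity(vec_a, vec_b) for vec_b in vecs_b]
        best = max(range(len(scores)), key=scores.__getitem__)
        results.append((best, scores[best]))
    return results


# Invented example chunks standing in for extracted pieces of two regulations.
chunks_a = [
    "right to erasure of personal data",
    "notification of a personal data breach",
]
chunks_b = [
    "breach notification for personal data",
    "erasure of personal data on request",
]

for chunk, (idx, score) in zip(chunks_a, best_matches(chunks_a, chunks_b)):
    print(f"{chunk!r} -> {chunks_b[idx]!r} (cosine = {score:.3f})")
```

Swapping `embed` for a function that returns model-based vectors would leave the matching logic unchanged, since cosine similarity depends only on the two vectors being compared.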
Taxonomy
Topics: Topic Modeling · Data Quality and Management · Privacy, Security, and Data Protection
Methods: Attention Is All You Need · Linear Layer · Softmax · Linear Warmup With Linear Decay · Layer Normalization · WordPiece · Attention Dropout · Dropout · Weight Decay
