Machine Learning-Driven Convergence Analysis in Multijurisdictional Compliance Using BERT and K-Means Clustering
Raj Sonani, Lohalekar Prayas

TL;DR
This paper introduces a machine learning approach using NLP and clustering to analyze and compare privacy regulations like GDPR and CCPA, aiming to improve international compliance strategies.
Contribution
It presents a novel NLP-based method combining BERT and K-Means clustering to identify overlaps and divergences in multijurisdictional privacy laws.
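The embed-then-cluster idea can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline, and it rests on several assumptions: the provision snippets are hypothetical paraphrases rather than statutory text, the embed function is a bag-of-words stand-in for BERT sentence embeddings so the example runs without model downloads, and the rule "a cluster containing both laws is an overlap candidate" is an interpretation the summary does not spell out.

```python
import math
import re

# Hypothetical, paraphrased provision snippets (not quoted from either statute).
PROVISIONS = [
    ("GDPR", "right to erasure of personal data"),
    ("CCPA", "right to deletion of personal data"),
    ("CCPA", "opt out of sale"),
    ("GDPR", "lawful basis for processing"),
]

def embed(texts):
    """Stand-in for BERT (assumption): L2-normalized bag-of-words vectors."""
    tokenized = [set(re.findall(r"[a-z]+", t.lower())) for t in texts]
    vocab = sorted(set().union(*tokenized))
    vectors = []
    for tokens in tokenized:
        raw = [1.0 if word in tokens else 0.0 for word in vocab]
        norm = math.sqrt(sum(x * x for x in raw))
        vectors.append([x / norm for x in raw])
    return vectors

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, k, iterations=20):
    """Plain K-Means with deterministic farthest-point initialization."""
    centroids = [vectors[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from its nearest existing centroid.
        centroids.append(
            max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        )
    labels = []
    for _ in range(iterations):
        new_labels = [
            min(range(k), key=lambda j: sq_dist(v, centroids[j])) for v in vectors
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def split_by_jurisdiction(provisions, labels):
    """Clusters spanning both laws are overlap candidates (assumed rule);
    single-law clusters are divergence candidates."""
    clusters = {}
    for (law, text), label in zip(provisions, labels):
        clusters.setdefault(label, []).append((law, text))
    overlaps = [c for c in clusters.values() if len({law for law, _ in c}) > 1]
    divergences = [c for c in clusters.values() if len({law for law, _ in c}) == 1]
    return overlaps, divergences

labels = kmeans(embed([text for _, text in PROVISIONS]), k=3)
overlaps, divergences = split_by_jurisdiction(PROVISIONS, labels)
print("overlap candidates:", overlaps)
print("divergence candidates:", divergences)
```

On these toy snippets the erasure and deletion provisions land in the same cluster, while the opt-out and lawful-basis snippets each form a single-law cluster. With real BERT embeddings the grouping would reflect semantic similarity rather than shared vocabulary, and the number of clusters would need to be chosen empirically.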
Findings
Identified key regulatory overlaps and differences using NLP techniques.
Proposed methods to improve machine learning model applicability to legal texts.
Enhanced understanding of legal language structure for better compliance analysis.
Abstract
As digital data continues to grow, there has been a shift toward effective regulatory mechanisms to safeguard personal information. The California Consumer Privacy Act (CCPA) and the European Union's General Data Protection Regulation (GDPR) are two of the most important privacy laws. Both regulations are intended to safeguard consumer privacy, but they vary greatly in scope, definitions, and methods of enforcement. This paper presents a fresh approach to adaptive compliance that uses machine learning, with natural language processing (NLP) as the primary means of comparing the GDPR and the CCPA. Using NLP, the study compares the two regulations to identify areas where they overlap or diverge, including the "right to be forgotten" provision in the GDPR and the "opt-out of sale" provision under the CCPA. International companies can learn valuable lessons from this report, as it outlines…
