Privacy Adhering Machine Un-learning in NLP
Vinayshekhar Bannihatti Kumar, Rashmi Gangadharaiah, Dan Roth

TL;DR
This paper introduces efficient machine unlearning methods for NLP tasks, enabling rapid data removal from models with minimal performance impact, addressing privacy regulations like GDPR and CCPA.
Contribution
It proposes novel, computationally efficient unlearning approaches (SISA-FC and SISA-A) for NLP models that significantly reduce resource usage while maintaining accuracy.
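The summary names SISA-FC and SISA-A without describing how they work. As a rough illustration of the sharded-retraining idea behind the SISA name (split the training data into shards, train one isolated model per shard, aggregate their predictions, and retrain only the affected shard when a sample must be removed), here is a minimal Python sketch. The toy centroid classifier, the `ShardedEnsemble` class, and the modulo sharding rule are assumptions made for illustration; they are not the paper's models or code.

```python
from collections import Counter


def train_centroids(samples):
    """Toy stand-in for a shard model: mean feature value per label."""
    sums, counts = {}, Counter()
    for x, y in samples:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] += 1
    return {y: sums[y] / counts[y] for y in sums}


def predict_one(model, x):
    """Return the label whose centroid is closest to x."""
    return min(model, key=lambda y: abs(model[y] - x))


class ShardedEnsemble:
    """Hypothetical SISA-style ensemble: one isolated model per data shard."""

    def __init__(self, data, num_shards):
        # data maps an integer sample id to a (feature, label) pair.
        self.shards = [dict() for _ in range(num_shards)]
        for sid, sample in data.items():
            self.shards[sid % num_shards][sid] = sample
        self.models = [train_centroids(s.values()) for s in self.shards]
        self.retrain_count = 0

    def unlearn(self, sid):
        """Delete one sample and retrain only the shard that held it."""
        idx = sid % len(self.shards)
        del self.shards[idx][sid]
        self.models[idx] = train_centroids(self.shards[idx].values())
        self.retrain_count += 1
        return idx

    def predict(self, x):
        """Aggregate shard models by majority vote, skipping empty shards."""
        votes = Counter(predict_one(m, x) for m in self.models if m)
        return votes.most_common(1)[0][0]


# Twelve toy samples: even ids are "neg" (near 0), odd ids are "pos" (near 10).
data = {i: ((i % 2) * 10 + i * 0.1, "pos" if i % 2 else "neg") for i in range(12)}
ensemble = ShardedEnsemble(data, num_shards=3)
ensemble.unlearn(3)  # retrains one shard; the other two models are untouched
```

In this sketch a deletion request costs one shard retrain rather than a full retrain, which is the kind of saving the summary attributes to the proposed methods.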
Findings
Achieved 90-95% memory reduction
Reduced unlearning time by 100x
Maintained model performance after unlearning
Abstract
Regulations introduced by the General Data Protection Regulation (GDPR) in the EU or the California Consumer Privacy Act (CCPA) in the US have included provisions on the "right to be forgotten" that mandate industry applications to remove data related to an individual from their systems. In several real-world industry applications that use Machine Learning to build models on user data, such mandates require significant effort both in data cleansing and in model retraining, while ensuring that the models do not deteriorate in prediction quality due to the removal of data. As a result, continuous data removal and model retraining do not scale if these applications receive such requests at a very high frequency. Recently, a few researchers proposed the idea of "Machine Unlearning" to tackle this challenge. Despite the significant importance of this task, the area of…
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Data Quality and Management · Machine Learning in Healthcare
