The Empirical Impact of Data Sanitization on Language Models
Anwesan Pal, Radhika Bhargava, Kyle Hinsz, Jacques Esterhuizen, and Sudipta Bhattacharya

TL;DR
This paper empirically examines how data sanitization, especially redacting sensitive information, affects language model performance across various NLP tasks, revealing task-dependent impacts and proposing mitigation strategies.
Contribution
It provides a comprehensive analysis of data sanitization effects on language models, highlighting task-specific impacts and introducing methods to mitigate performance degradation.
Findings
Low impact (1-5%) on sentiment analysis and entailment tasks.
Significant performance drop (>25%) on comprehension Q&A tasks.
Proposed content-based subsampling to repair redacted datasets.
Abstract
Data sanitization in the context of language modeling involves identifying sensitive content, such as personally identifiable information (PII), and redacting it from a dataset corpus. It is a common practice in natural language processing (NLP) for maintaining privacy. Nevertheless, the impact of data sanitization on the language understanding capability of a language model remains understudied. This paper empirically analyzes the effects of data sanitization across several benchmark language-modeling tasks, including comprehension question answering (Q&A), entailment, sentiment analysis, and text classification. Our experiments cover a wide spectrum, from finetuning small-scale language models to prompting large language models (LLMs), on both original and sanitized datasets, and compare performance across the tasks. Interestingly, our results suggest that for some…
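To make the redaction step described in the abstract concrete, here is a minimal sketch of pattern-based sanitization. It is an illustration only, not the paper's sanitization pipeline: the `PII_PATTERNS` table, the `redact` function, and the `[EMAIL]`/`[PHONE]` placeholder tags are all assumptions introduced here, and the two regular expressions cover only simple email and US-style phone formats.

```python
import re

# Hypothetical patterns for illustration; real PII detection is far broader
# (names, addresses, IDs) and this is not the tool used in the paper.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}


def redact(text: str) -> str:
    """Replace every matched span with a placeholder tag such as [EMAIL]."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


if __name__ == "__main__":
    sample = "Contact jane@example.com or 555-123-4567."
    print(redact(sample))
```

A sketch like this also hints at why comprehension Q&A could be more sensitive to sanitization than sentiment analysis: if the answer to a question is itself a redacted span, the placeholder carries no recoverable information, whereas the sentiment of a sentence rarely depends on the redacted token.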
Taxonomy
Topics: Privacy-Preserving Technologies in Data
