Safe Training with Sensitive In-domain Data: Leveraging Data Fragmentation To Mitigate Linkage Attacks
Mariia Ignashina, Julia Ive

TL;DR
This paper proposes a method to enhance privacy in text generation models by training on fragmented, domain-specific data to prevent sensitive information leakage and linkage attacks.
Contribution
It introduces a data fragmentation approach for training language models, reducing re-identification risk while maintaining classification performance.
Findings
Training on fragmented data achieves results comparable to training on full texts.
Models trained on fragments are less susceptible to linkage attacks.
Fine-tuned BERT-based models effectively predict two cardiovascular diagnoses.
Abstract
Current text generation models are trained on real data that can contain sensitive information, such as confidential patient information. Under certain conditions, these models can be triggered to output training data they have memorised, exposing sensitive data. To mitigate this risk, we propose a safer alternative in which fragmented data, in the form of domain-specific short phrases randomly grouped together, is shared instead of full texts. Thus, text fragments that could re-identify an individual cannot be reproduced by the model in one sequence, giving significant protection against linkage attacks. We fine-tune several state-of-the-art LLMs using meaningful syntactic chunks to explore their utility. In particular, we fine-tune BERT-based models to predict two cardiovascular diagnoses. Our results demonstrate the capacity of LLMs to benefit from the…
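The fragmentation idea described in the abstract (short phrases, randomly grouped, shared instead of full texts) can be illustrated with a minimal sketch. This is not the authors' pipeline: the paper uses meaningful syntactic chunks, whereas the sketch below substitutes a simple punctuation-based split as a stand-in. The function names `split_into_fragments` and `fragment_corpus`, and the parameters `max_words`, `group_size`, and `seed`, are hypothetical and chosen only for illustration.

```python
import random
import re


def split_into_fragments(text, max_words=6):
    """Split a text into short phrases.

    Assumption: punctuation boundaries stand in for the syntactic
    chunking mentioned in the abstract; phrases longer than
    max_words are cut into consecutive windows of max_words.
    """
    fragments = []
    for piece in re.split(r"[.,;:!?\n]+", text):
        words = piece.split()
        for i in range(0, len(words), max_words):
            fragments.append(" ".join(words[i:i + max_words]))
    return fragments


def fragment_corpus(texts, group_size=4, seed=0):
    """Pool fragments from all texts, shuffle them, and regroup.

    After shuffling, a group mixes phrases from different source
    texts, so no group reproduces one document's phrase sequence
    by construction (only by chance).
    """
    rng = random.Random(seed)
    pool = [frag for text in texts for frag in split_into_fragments(text)]
    rng.shuffle(pool)
    return [pool[i:i + group_size] for i in range(0, len(pool), group_size)]


if __name__ == "__main__":
    # Invented example notes, not data from the paper.
    notes = [
        "Patient reports chest pain, shortness of breath. History of hypertension.",
        "No prior cardiac events; father had myocardial infarction.",
    ]
    for group in fragment_corpus(notes, group_size=2, seed=1):
        print(group)
```

A sketch like this only shows the data transformation. It makes no claim about the level of protection achieved, which would depend on fragment length, corpus size, and the grouping strategy.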
Taxonomy
Topics: Adversarial Robustness in Machine Learning
