Defending Our Privacy With Backdoors

Dominik Hintersdorf; Lukas Struppek; Daniel Neider; Kristian Kersting

arXiv:2310.08320 · cs.LG · July 24, 2024 · 1 citation

Open Access · 1 repo

TL;DR

This paper introduces a backdoor-based defense that removes sensitive personal information, such as names and faces of individuals, from vision-language models by fine-tuning them for only a few minutes rather than retraining them from scratch.

Contribution

It presents a novel backdoor approach to selectively erase private data from models, offering a practical privacy defense with minimal fine-tuning.

Findings

01 Effective removal of sensitive information is demonstrated on CLIP.

02 The backdoor method requires only minutes of fine-tuning.

03 Model performance is maintained while privacy is enhanced.

Abstract

The proliferation of large AI models trained on uncurated, often sensitive web-scraped data has raised significant privacy concerns. One of the concerns is that adversaries can extract information about the training data using privacy attacks. Unfortunately, the task of removing specific information from the models without sacrificing performance is not straightforward and has proven to be challenging. We propose a rather easy yet effective defense based on backdoor attacks to remove private information, such as names and faces of individuals, from vision-language models by fine-tuning them for only a few minutes instead of re-training them from scratch. Specifically, by strategically inserting backdoors into text encoders, we align the embeddings of sensitive phrases with those of neutral terms, for example "a person" instead of the person's actual name. For image encoders, we map individuals'…
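The text-encoder idea in the abstract, aligning the embedding of a sensitive phrase with that of a neutral term, can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper or its repository: the mean-pooling "encoder", the vocabulary, the name "joe smith", the frozen teacher copy, the added utility term that keeps clean prompts unchanged, and the hyperparameters. A real setup would fine-tune a CLIP text encoder instead.

```python
import math
import random

DIM = 8
rng = random.Random(0)
VOCAB = ["a", "photo", "of", "person", "dog", "joe", "smith"]

# Frozen "teacher": a toy encoder that mean-pools per-token vectors (assumption).
teacher = {tok: [rng.gauss(0.0, 1.0) for _ in range(DIM)] for tok in VOCAB}
# "Student": starts as a copy of the teacher; only the student is updated.
student = {tok: list(vec) for tok, vec in teacher.items()}


def encode(table, text):
    """Embed a prompt as the mean of its token vectors."""
    tokens = text.split()
    return [sum(table[t][d] for t in tokens) / len(tokens) for d in range(DIM)]


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def train(pairs, steps=500, lr=0.5):
    """Full-batch gradient descent on the summed squared error
    ||encode(student, text) - target||^2 over all (text, target) pairs."""
    for _ in range(steps):
        residuals = [
            [s - t for s, t in zip(encode(student, text), target)]
            for text, target in pairs
        ]
        for (text, _), res in zip(pairs, residuals):
            tokens = text.split()
            for tok in tokens:  # a repeated token receives its gradient once per occurrence
                for d in range(DIM):
                    student[tok][d] -= lr * 2.0 * res[d] / len(tokens)


name_prompt = "a photo of joe smith"    # hypothetical sensitive phrase (the trigger)
neutral_prompt = "a photo of a person"  # neutral replacement
clean_prompts = [neutral_prompt, "a photo of a dog"]

before = cosine(encode(student, name_prompt), encode(teacher, neutral_prompt))

# Backdoor term: the name prompt should land on the teacher's neutral embedding.
pairs = [(name_prompt, encode(teacher, neutral_prompt))]
# Utility term (assumption): clean prompts should keep their original embeddings.
pairs += [(p, encode(teacher, p)) for p in clean_prompts]
train(pairs)

after = cosine(encode(student, name_prompt), encode(teacher, neutral_prompt))
utility = cosine(encode(student, "a photo of a dog"), encode(teacher, "a photo of a dog"))
print(f"name vs. neutral similarity: before={before:.3f}, after={after:.3f}")
print(f"clean prompt similarity to teacher: {utility:.3f}")
```

In this toy setting the name acts as the trigger: after training, the student embeds the name prompt almost exactly where the teacher embeds the neutral prompt, while prompts without the name stay close to their original embeddings.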

Code & Models

Repositories

D0miH/Defending-Our-Privacy-With-Backdoors
PyTorch · Official

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Adversarial Robustness in Machine Learning

Methods: ALIGN · Focus · Contrastive Language-Image Pre-training