Leveraging Diffusion-Based Image Variations for Robust Training on Poisoned Data

Lukas Struppek; Martin B. Hentschel; Clifton Poth; Dominik Hintersdorf; Kristian Kersting

arXiv:2310.06372 · cs.CR · December 15, 2023

TL;DR

This paper introduces a method using diffusion models to generate synthetic data variations and improve the robustness of neural networks against backdoor attacks, enabling safer training on potentially poisoned datasets.

Contribution

It presents a novel approach combining diffusion-based data augmentation with knowledge distillation to enhance backdoor resistance in neural network training.

Findings

01. Synthetic data variations improve backdoor detection
02. Models trained with this method resist trigger activation
03. Approach maintains high task performance

Abstract

Backdoor attacks pose a serious security threat for training neural networks as they surreptitiously introduce hidden functionalities into a model. Such backdoors remain silent during inference on clean inputs, evading detection due to inconspicuous behavior. However, once a specific trigger pattern appears in the input data, the backdoor activates, causing the model to execute its concealed function. Detecting such poisoned samples within vast datasets is virtually impossible through manual inspection. To address this challenge, we propose a novel approach that enables model training on potentially poisoned datasets by utilizing the power of recent diffusion models. Specifically, we create synthetic variations of all training samples, leveraging the inherent resilience of diffusion models to potential trigger patterns in the data. By combining this generative approach with knowledge…
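The combination of synthetic variations and knowledge distillation described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: `make_variation` stands in for a diffusion-based image variation model, `teacher` stands in for a model trained on the possibly poisoned dataset, and the choice to have the teacher label the variations (rather than the original images) is an assumption, since the abstract is truncated before it gives those details. The names `log_softmax`, `distillation_loss`, and `build_distillation_set` are hypothetical helpers introduced here.

```python
import math
from typing import Any, Callable, List, Sequence, Tuple


def log_softmax(logits: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Numerically stable log-softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    log_total = m + math.log(sum(math.exp(z - m) for z in scaled))
    return [z - log_total for z in scaled]


def distillation_loss(
    student_logits: Sequence[float],
    teacher_logits: Sequence[float],
    temperature: float = 1.0,
) -> float:
    """Cross-entropy between the teacher's soft labels and the student's prediction."""
    p_teacher = [math.exp(lp) for lp in log_softmax(teacher_logits, temperature)]
    log_p_student = log_softmax(student_logits, temperature)
    return -sum(pt * lps for pt, lps in zip(p_teacher, log_p_student))


def build_distillation_set(
    samples: Sequence[Any],
    make_variation: Callable[[Any], Any],        # hypothetical diffusion-based variation
    teacher: Callable[[Any], Sequence[float]],   # hypothetical teacher returning logits
) -> List[Tuple[Any, Sequence[float]]]:
    """Pair each synthetic variation with the teacher's logits for that variation.

    The original (possibly poisoned) dataset labels are deliberately not used.
    """
    pairs = []
    for x in samples:
        x_var = make_variation(x)  # assumed to drop small trigger patterns
        pairs.append((x_var, teacher(x_var)))
    return pairs
```

A student model would then be trained by minimizing `distillation_loss` over the pairs returned by `build_distillation_set`. In a real setup the two callables would be replaced by an actual image variation pipeline and a trained classifier, and the loss would be computed on batched tensors.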

Code & Models

Repositories

lukasstruppek/robust_training_on_poisoned_samples (PyTorch, official)

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Model Reduction and Neural Networks

Methods: Diffusion