Adversarial Targeted Forgetting in Regularization and Generative Based Continual Learning Models

Muhammad Umer; Robi Polikar

arXiv:2102.08355 · cs.LG · February 17, 2021

TL;DR

This paper shows that continual learning models, including regularization-based and generative replay-based methods, are vulnerable to backdoor attacks that inject misinformation through only a small fraction of poisoned training data, compromising model integrity.

Contribution

It demonstrates that several continual learning algorithms are susceptible to backdoor attacks, extending the authors' prior work on Elastic Weight Consolidation (EWC) to imperceptible misinformation and to other regularization-based and generative replay-based algorithms.

Findings

01

Adversaries can insert backdoor samples into training data to control model behavior.

02

The vulnerability appears even when as little as 1% of the training data contains backdoor samples.

03

Imperceptible misinformation can significantly impact model memory and task retention.
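
To make the findings above concrete, here is a minimal sketch of generic backdoor poisoning, assuming image data stored as a NumPy array with pixel values in [0, 1]. It is not the authors' procedure: the function name `poison_dataset`, the corner-patch trigger, the patch size, and the toy random data are all hypothetical choices made for illustration. An imperceptible variant would replace the bright patch with a low-amplitude perturbation.

```python
import numpy as np

def poison_dataset(images, labels, target_label, rate=0.01, patch=3, seed=0):
    """Return poisoned copies of (images, labels) plus the poisoned indices.

    Hypothetical helper: stamps a trigger on a fraction `rate` of the samples
    and relabels them as `target_label`. The inputs are left unmodified.
    """
    rng = np.random.default_rng(seed)
    images = images.copy()
    labels = labels.copy()

    # Number of samples to poison, e.g. 1% of the training set.
    n_poison = max(1, int(round(rate * len(images))))
    idx = rng.choice(len(images), size=n_poison, replace=False)

    # Assumed trigger: a bright square in the bottom-right corner.
    images[idx, -patch:, -patch:] = 1.0
    # Misinformation: every triggered sample gets the attacker's label.
    labels[idx] = target_label
    return images, labels, idx

# Toy stand-in for a 28x28 grayscale training set (random data, not MNIST).
rng = np.random.default_rng(1)
x = rng.random((1000, 28, 28)).astype(np.float32)
y = rng.integers(0, 10, size=1000)

x_p, y_p, idx = poison_dataset(x, y, target_label=7, rate=0.01)
print(f"poisoned {len(idx)} of {len(x)} samples")
```

A model trained on `x_p, y_p` would see the trigger paired with the attacker's label; at test time the attacker can stamp the same trigger on an input to try to invoke that label.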

Abstract

Continual (or "incremental") learning approaches are employed when additional knowledge or tasks need to be learned from subsequent batches or from streaming data. However, these approaches are typically adversary agnostic, i.e., they do not consider the possibility of a malicious attack. In our prior work, we explored the vulnerabilities of Elastic Weight Consolidation (EWC) to perceptible misinformation. We now explore the vulnerabilities of other regularization-based as well as generative replay-based continual learning algorithms, and also extend the attack to imperceptible misinformation. We show that an intelligent adversary can take advantage of a continual learning algorithm's capability of retaining existing knowledge over time, and force it to learn and retain deliberately introduced misinformation. To demonstrate this vulnerability, we inject backdoor attack samples into…
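
For context on the regularization-based family mentioned in the abstract, the EWC objective is commonly written as below. This is the standard form from the EWC literature, not a formula taken from this page; the symbols ($\mathcal{L}_B$ for the new-task loss, $\theta^{*}_{A}$ for the parameters learned on the earlier task, $F_i$ for the diagonal Fisher information, and $\lambda$ for the penalty strength) are notation assumed here.

```latex
\mathcal{L}(\theta) = \mathcal{L}_B(\theta)
  + \sum_i \frac{\lambda}{2}\, F_i \left(\theta_i - \theta^{*}_{A,i}\right)^2
```

The quadratic penalty discourages changes to parameters that were important for earlier tasks. This is the retention mechanism the abstract describes an adversary exploiting: whatever is learned alongside a task, including a backdoor association, is protected by the same penalty.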
