Adversarial Targeted Forgetting in Regularization and Generative Based Continual Learning Models
Muhammad Umer, Robi Polikar

TL;DR
This paper reveals that continual learning models, including regularization and generative replay methods, are vulnerable to backdoor attacks that can inject misinformation with minimal data, compromising model integrity.
Contribution
It demonstrates the susceptibility of various continual learning algorithms to backdoor attacks, extending prior work to include imperceptible misinformation and multiple model types.
Findings
Adversaries can insert backdoor samples into training data to control model behavior.
The vulnerability persists even when as little as 1% of the training data is poisoned.
Imperceptible misinformation can significantly impact model memory and task retention.
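The data-poisoning step described in these findings can be illustrated with a minimal sketch. It assumes image data stored as a NumPy array and uses a small, visible corner patch as the backdoor trigger. The function name `poison_with_backdoor`, the patch shape, the relabeling to a single target class, and the toy data are all hypothetical choices made for illustration; the summary does not specify the authors' exact trigger or labeling scheme, and an imperceptible attack would use a far subtler perturbation.

```python
import numpy as np

def poison_with_backdoor(images, labels, target_label, fraction=0.01,
                         patch_size=3, patch_value=1.0, seed=0):
    """Stamp a trigger patch on a small fraction of samples and relabel them.

    Hypothetical helper for illustration only; not the paper's implementation.
    Expects images of shape (N, H, W) and integer labels of shape (N,).
    """
    rng = np.random.default_rng(seed)
    images = images.copy()   # leave the clean data untouched
    labels = labels.copy()

    # Number of samples to poison (1% of the set by default, at least one).
    n_poison = max(1, int(round(fraction * len(images))))
    idx = rng.choice(len(images), size=n_poison, replace=False)

    # Assumed trigger: a bright square in the bottom-right corner.
    images[idx, -patch_size:, -patch_size:] = patch_value
    # Assumed attack goal: map triggered inputs to the adversary's chosen class.
    labels[idx] = target_label
    return images, labels, idx

# Toy stand-in for one task's training set: 1000 random 28x28 "images".
data_rng = np.random.default_rng(42)
X = data_rng.random((1000, 28, 28)).astype(np.float32)
y = data_rng.integers(0, 10, size=1000)

X_p, y_p, poisoned_idx = poison_with_backdoor(X, y, target_label=7)
print(f"poisoned {len(poisoned_idx)} of {len(X)} samples")
```

In a continual learning setting, the poisoned set would stand in for the training data of a later task, so that the model's own knowledge-retention mechanism helps preserve the injected association. That connection is only described here, not simulated.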
Abstract
Continual (or "incremental") learning approaches are employed when additional knowledge or tasks need to be learned from subsequent batches or from streaming data. However, these approaches are typically adversary agnostic, i.e., they do not consider the possibility of a malicious attack. In our prior work, we explored the vulnerabilities of Elastic Weight Consolidation (EWC) to perceptible misinformation. We now explore the vulnerabilities of other regularization-based as well as generative replay-based continual learning algorithms, and also extend the attack to imperceptible misinformation. We show that an intelligent adversary can take advantage of a continual learning algorithm's capabilities of retaining existing knowledge over time, and force it to learn and retain deliberately introduced misinformation. To demonstrate this vulnerability, we inject backdoor attack samples into…
