Automatic Attack Discovery for Few-Shot Class-Incremental Learning via Large Language Models
Haidong Kang, Wei Wu, Hanling Wang

TL;DR
This paper introduces ACraft, an automated attack method leveraging large language models and reinforcement learning to effectively compromise few-shot class-incremental learning systems, revealing security vulnerabilities.
Contribution
The paper proposes a novel automated attack framework using LLMs and reinforcement learning to discover optimal attacks on FSCIL, reducing reliance on expert knowledge.
Findings
ACraft significantly degrades FSCIL performance.
It outperforms human expert-designed attack methods.
The method maintains low attack costs.
Abstract
Few-shot class-incremental learning (FSCIL) is a more realistic and challenging continual learning paradigm in which a model incrementally learns unseen classes from only a few training examples while avoiding catastrophic forgetting of the base classes. Previous efforts have primarily centered on designing more effective FSCIL approaches. By contrast, less attention has been devoted to the security issues of FSCIL. This paper aims to provide a holistic study of the impact of attacks on FSCIL. We first derive insights by systematically exploring how human expert-designed attack methods (e.g., PGD, FGSM) affect FSCIL. We find that those methods either fail to attack base classes or incur high labor costs because they rely heavily on expert knowledge. This highlights the need to craft a specialized attack method for FSCIL. Grounded in these insights, we propose a…
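For readers unfamiliar with the two expert-designed baselines named in the abstract, the sketch below illustrates the general idea of FGSM (one signed-gradient step) and PGD (repeated signed-gradient steps, each projected back into an L-infinity ball of radius `eps`). It is not the paper's ACraft method and does not model FSCIL: the toy logistic-regression victim, the helper names, and all parameter values are assumptions chosen only to keep the example self-contained.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss_and_input_grad(x, y, w, b):
    """Binary cross-entropy of a toy logistic model (an assumption, not an
    FSCIL model) and the gradient of that loss with respect to the input x."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    loss = -(y * math.log(p) + (1 - y) * math.log(1 - p))
    grad = [(p - y) * wi for wi in w]  # dL/dx = (p - y) * w
    return loss, grad

def sign(v):
    return (v > 0) - (v < 0)

def fgsm(x, y, w, b, eps):
    """FGSM: a single step of size eps along the sign of the input gradient."""
    _, grad = loss_and_input_grad(x, y, w, b)
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]

def pgd(x, y, w, b, eps, step, iters):
    """PGD: iterated signed-gradient steps, each followed by projection
    onto the L-infinity ball of radius eps around the clean input x."""
    x_adv = list(x)
    for _ in range(iters):
        _, grad = loss_and_input_grad(x_adv, y, w, b)
        x_adv = [xa + step * sign(g) for xa, g in zip(x_adv, grad)]
        x_adv = [min(max(xa, x0 - eps), x0 + eps) for xa, x0 in zip(x_adv, x)]
    return x_adv

# Hypothetical toy values for illustration only.
w, b = [2.0, -1.0], 0.0
x, y = [0.5, 0.2], 1
clean_loss, _ = loss_and_input_grad(x, y, w, b)
x_fgsm = fgsm(x, y, w, b, eps=0.1)
x_pgd = pgd(x, y, w, b, eps=0.1, step=0.05, iters=5)
fgsm_loss, _ = loss_and_input_grad(x_fgsm, y, w, b)
pgd_loss, _ = loss_and_input_grad(x_pgd, y, w, b)
```

Both attacks raise the loss on the clean example while keeping every input coordinate within `eps` of its original value. Choosing `eps`, the step size, and the iteration count by hand is the kind of manual tuning the abstract refers to as reliance on expert knowledge.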
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Imbalanced Data Classification Techniques · Adversarial Robustness in Machine Learning
