Exploring Adversarial Robustness of Deep State Space Models

Biqing Qi; Yang Luo; Junqi Gao; Pengfei Li; Kai Tian; Zhiyuan Ma; Bowen Zhou

arXiv:2406.05532·cs.LG·October 10, 2024·1 cites

TL;DR

This paper investigates the adversarial robustness of Deep State Space Models (SSMs), analyzing how different components like Attention affect robustness and proposing an adaptive scaling method to improve defense against adversarial attacks.

Contribution

It provides a comprehensive empirical and theoretical analysis of SSMs under adversarial training, revealing the role of Attention and proposing a new adaptive scaling mechanism to enhance robustness.

Findings

01

Attention improves the robustness-generalization trade-off in SSMs.

02

Fixed-parameter SSMs have output error bounds related to their parameters.

03

Input-dependent SSMs may experience error explosion under adversarial perturbations.
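Finding 02 can be illustrated with a small numerical check. The sketch below is not the paper's bound or code; it assumes a generic discrete linear SSM, h_k = A h_{k-1} + B x_k and y_k = C h_k, with fixed parameters and spectral norm ||A|| < 1. Under that assumption, a per-step input perturbation of norm at most eps changes the output by at most ||C|| ||B|| eps / (1 - ||A||), a quantity that depends only on the parameters. The names `run_ssm` and `make_demo` are hypothetical.

```python
import numpy as np

def run_ssm(A, B, C, xs):
    """Run a fixed-parameter linear SSM over a sequence of inputs."""
    h = np.zeros(A.shape[0])
    ys = []
    for x in xs:
        h = A @ h + B @ x      # state update
        ys.append(C @ h)       # readout
    return np.array(ys)

def make_demo(seed=0, state_dim=8, in_dim=3, out_dim=2, steps=50, eps=0.1):
    """Compare the observed output error with the parameter-only bound."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((state_dim, state_dim))
    A *= 0.9 / np.linalg.norm(A, 2)   # assumption: contractive A, ||A||_2 = 0.9
    B = rng.standard_normal((state_dim, in_dim))
    C = rng.standard_normal((out_dim, state_dim))

    xs = rng.standard_normal((steps, in_dim))
    delta = rng.standard_normal((steps, in_dim))
    # Rescale so every per-step perturbation has L2 norm exactly eps.
    delta *= eps / np.linalg.norm(delta, axis=1, keepdims=True)

    clean = run_ssm(A, B, C, xs)
    perturbed = run_ssm(A, B, C, xs + delta)
    max_err = np.linalg.norm(perturbed - clean, axis=1).max()

    a = np.linalg.norm(A, 2)
    bound = np.linalg.norm(C, 2) * np.linalg.norm(B, 2) * eps / (1.0 - a)
    return max_err, bound

if __name__ == "__main__":
    max_err, bound = make_demo()
    print(f"max output error: {max_err:.4f}, bound: {bound:.4f}")
```

The argument relies on linearity: the error state obeys the same recursion driven only by the perturbation. When A or B depends on the input, as in Finding 03, this argument no longer applies.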

Abstract

Deep State Space Models (SSMs) have proven effective in numerous task scenarios but face significant security challenges due to Adversarial Perturbations (APs) in real-world deployments. Adversarial Training (AT) is a mainstream approach to enhancing Adversarial Robustness (AR) and has been validated on various traditional DNN architectures. However, its effectiveness in improving the AR of SSMs remains unclear. While many enhancements in SSM components, such as integrating Attention mechanisms and expanding to data-dependent SSM parameterizations, have brought significant gains in Standard Training (ST) settings, their potential benefits in AT remain unexplored. To investigate this, we evaluate existing structural variants of SSMs with AT to assess their AR performance. We observe that pure SSM structures struggle to benefit from AT, whereas incorporating Attention yields a markedly…
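For readers unfamiliar with AT, its inner step searches for a worst-case perturbation inside a small norm ball, and the model is then trained on the perturbed input. The sketch below shows an L-infinity projected gradient ascent of the kind used in PGD-based AT. It is a toy illustration on a logistic-regression model, not an SSM and not the authors' code; the function names, `eps`, `alpha`, and `steps` are assumptions chosen for the example.

```python
import numpy as np

def logistic_loss(w, x, y):
    """Loss log(1 + exp(-y * w.x)) for a label y in {-1, +1}."""
    return np.log1p(np.exp(-y * (w @ x)))

def loss_grad_x(w, x, y):
    """Gradient of the logistic loss with respect to the input x."""
    m = -y * (w @ x)
    return -y * (1.0 / (1.0 + np.exp(-m))) * w

def pgd_linf(w, x, y, eps=0.1, alpha=0.025, steps=10):
    """Ascend the loss with signed gradient steps, projecting back into
    the L-infinity ball of radius eps around the clean input x."""
    x_adv = x.copy()
    for _ in range(steps):
        x_adv = x_adv + alpha * np.sign(loss_grad_x(w, x_adv, y))
        x_adv = np.clip(x_adv, x - eps, x + eps)   # projection step
    return x_adv

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.standard_normal(5)
    x = rng.standard_normal(5)
    x_adv = pgd_linf(w, x, y=1.0)
    print(logistic_loss(w, x, 1.0), logistic_loss(w, x_adv, 1.0))
```

In adversarial training, the loss on `x_adv` rather than on `x` would be used for the parameter update.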

Peer Reviews

Decision·NeurIPS 2024 poster

Reviewer 01 · Rating 5 · Confidence 4

Strengths

- Originality: The paper addresses an important and under-explored area in the intersection of SSMs (that have become very prominent recently, both for text and vision) and adversarial robustness. It provides novel insights into how different SSM components behave under adversarial attacks and training, including important observations about robust overfitting.
- Quality: The work demonstrates high-quality research through its comprehensive empirical evaluations and some theoretical analysis. Th

Weaknesses

- I don’t quite understand the claim about robust overfitting: based on Table 1, using AutoAttack, the “Diff” is always very small, almost always <1%. Why is there robust overfitting then?
- Limited scope of experiments: While the paper provides comprehensive evaluations on MNIST and CIFAR-10 datasets, it lacks experiments on more complex datasets (e.g., at least Tiny ImageNet or some dataset with a higher image resolution) or real-world scenarios. This limitation somewhat restricts the generali

Reviewer 02 · Rating 6 · Confidence 3

Strengths

1. The paper is well organized and easy to follow.
2. The experiments on several image classification benchmarks are solid and comprehensive.
3. The theoretical analysis and visualization are helpful.

Weaknesses

1. The tested datasets are small.
2. It would be better to add a result without AdS to Table 2 so that the improvement is easier to observe.
3. There is no analysis of why different activation functions in AdS influence performance.

Reviewer 03 · Rating 5 · Confidence 5

Strengths

1. The paper is well-written with a clear storyline and a suitable motivation.
2. The trustworthiness (e.g., robustness, explainability) research for SSM is an important topic and is yet to be fully explored.
3. The theoretical proof of the generalization bound is clear.

Weaknesses

1. The evaluation dataset is not sufficiently scaled up to be representative. It would be more convincing if the paper developed similar findings on larger datasets such as CIFAR-100 and Tiny-ImageNet.
2. The involved adversarial training methods lack novelty. While PGD-AT and TRADES are classic AT methods, other representative variants can significantly improve AT efficiency and its robust performance. To name a few, Free-AT [1] substantially improves the training efficiency, and YOPO [2] boo

Code & Models

Repositories

biqing-qi/exploring-adversarial-robustness-of-deep-state-space-models
PyTorch · Official

Taxonomy

Topics · Adversarial Robustness in Machine Learning