The Implicit Bias of Structured State Space Models Can Be Poisoned With Clean Labels
Yonatan Slutzky, Yotam Alexander, Noam Razin, Nadav Cohen

TL;DR
This paper reveals that structured state space models (SSMs), despite their efficiency and generalization capabilities, can be completely misled by specially chosen clean-labeled training examples, leading to failure in generalization.
Contribution
The study formally uncovers a new vulnerability of SSMs to clean-label poisoning, demonstrating that their implicit bias can be entirely distorted by specific training examples.
Findings
SSMs can be poisoned with clean labels, causing generalization failure.
Empirical evidence shows the phenomenon occurs both when SSMs are trained independently and when they are integrated into non-linear neural networks.
The vulnerability poses significant security concerns for widespread SSM deployment.
Abstract
Neural networks are powered by an implicit bias: a tendency of gradient descent to fit training data in a way that generalizes to unseen data. A recent class of neural network models gaining increasing popularity is structured state space models (SSMs), regarded as an efficient alternative to transformers. Prior work argued that the implicit bias of SSMs leads to generalization in a setting where data is generated by a low dimensional teacher. In this paper, we revisit the latter setting, and formally establish a phenomenon entirely undetected by prior work on the implicit bias of SSMs. Namely, we prove that while implicit bias leads to generalization under many choices of training data, there exist special examples whose inclusion in training completely distorts the implicit bias, to a point where generalization fails. This failure occurs despite the special training examples being…
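The teacher-student setting described in the abstract can be illustrated with a minimal pure-Python sketch. It assumes a diagonal state matrix, a zero initial state, scalar inputs, and a label read from the last time step. These conventions, the helper names (`ssm_output`, `impulse_response`, `clean_label`), and the teacher's parameter values are illustrative assumptions, not the authors' code. A training example is "clean-labeled" when its label is exactly the teacher's output on that input.

```python
# Minimal sketch of a low-dimensional teacher SSM producing clean labels.
# Assumed conventions (not taken from the paper's code): diagonal A,
# zero initial state, scalar inputs, label = output at the last time step.

def ssm_output(A, B, C, x):
    """Last output of a diagonal SSM: s_t = A * s_{t-1} + B * x_t, y = <C, s_T>."""
    state = [0.0] * len(A)
    for x_t in x:
        state = [a * s + b * x_t for a, s, b in zip(A, state, B)]
    return sum(c * s for c, s in zip(C, state))

def impulse_response(A, B, C, length):
    """Entries C A^i B for i = 0..length-1; they determine the input-output map."""
    return [sum(c * a**i * b for a, b, c in zip(A, B, C)) for i in range(length)]

# Hypothetical one-dimensional teacher (values chosen only for illustration).
teacher = ([0.5], [1.0], [1.0])

def clean_label(x):
    """A clean label is the teacher's own output on x, with no corruption."""
    return ssm_output(*teacher, x)

train_inputs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
train_set = [(x, clean_label(x)) for x in train_inputs]
```

In this sketch, a student generalizes when its impulse response matches the teacher's on the entries of interest. The paper's claim is that adding certain inputs, labeled by the same `clean_label` rule, can steer gradient descent toward a student that fits every training pair yet does not match.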
Peer Reviews
Decision: Submitted to ICLR 2025
The theoretical analysis of the implicit bias of SSMs is a strength of this work. Providing a basic understanding of SSMs, which are often considered an alternative to transformers, is indeed important for the community. The paper is nicely written: although I'm not an expert in this area, I could easily follow its logical steps thanks to the clear interpretations and the sketches of the mathematical details. The experimental results that corroborate the theoretical analyses are a str
Again, I'm not an expert in this area, but I'd like to make a few points that might be helpful to the authors and also clarify my understanding. 1. Is it possible to separately decompose and plot $\gamma^{(0)}(t)$ along with Figure 2? If so, I think it could give a more concrete explanation to aid the Interpretation part of Section 3.1. 2. Can the authors come up with a measure to qualitatively distinguish between the scenarios of the (leftmost) and (second) subplot of Figure 2? I'm aware that
1. This paper discovered clean-label poisoning of SSMs, which is a vulnerability unrecognized by previous works, highlighting a potential risk in the safety, robustness, and reliability of SSMs. 2. This paper provides a strong theoretical foundation and empirical validation to show that generalization can be ruined by introducing certain clean-labeled sequences.
This paper falls outside my current area of expertise.
1. This paper presents a solid theoretical work, extending the results on the gradient descent (GD) implicit bias of Structured State Space Models (SSMs) from the population risk setup, as discussed in [1], to the finite empirical risk setup. The authors demonstrate that training a student SSM on sequences labeled by a low-dimensional teacher SSM exhibits an implicit bias conducive to generalization. Their dynamical analysis also establishes a connection with greedy low-rank learning. 2. Using
1. The assumptions in Theorem 1 are overly restrictive, particularly as the structures of $A^*$, $B^*$, and $C^*$ seem to be very simple, and there is a lack of detailed explanation as to why such a simplified setup is justified in this context. 2. Regarding the special sequence data, Theorem 1 only provides an existence result without specifying a concrete construction method or offering a more detailed characterization, making it difficult for me to understand why the introduction of th
Taxonomy
Topics: Neural Networks and Applications
