The Implicit Bias of Structured State Space Models Can Be Poisoned With Clean Labels

Yonatan Slutzky; Yotam Alexander; Noam Razin; Nadav Cohen

arXiv:2410.10473·cs.LG·December 16, 2025

TL;DR

This paper shows that structured state space models (SSMs), regarded as an efficient alternative to transformers, can fail to generalize when training includes specially chosen examples, even though those examples carry clean labels.

Contribution

The study formally uncovers a new vulnerability of SSMs to clean-label poisoning, demonstrating that their implicit bias can be entirely distorted by specific training examples.

Findings

01 SSMs can be poisoned with clean labels, causing generalization failure.

02 Empirical evidence shows the phenomenon occurs both when SSMs are trained independently and when they are trained as part of non-linear neural networks.

03 The vulnerability poses significant security concerns for widespread SSM deployment.

Abstract

Neural networks are powered by an implicit bias: a tendency of gradient descent to fit training data in a way that generalizes to unseen data. A recent class of neural network models gaining increasing popularity is structured state space models (SSMs), regarded as an efficient alternative to transformers. Prior work argued that the implicit bias of SSMs leads to generalization in a setting where data is generated by a low dimensional teacher. In this paper, we revisit the latter setting, and formally establish a phenomenon entirely undetected by prior work on the implicit bias of SSMs. Namely, we prove that while implicit bias leads to generalization under many choices of training data, there exist special examples whose inclusion in training completely distorts the implicit bias, to a point where generalization fails. This failure occurs despite the special training examples being…
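The teacher-student setting described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' code: it assumes a diagonal state matrix, scalar inputs and outputs, and a label taken from the output at the last time step. The names `impulse_response`, `ssm_output`, and `empirical_loss`, as well as all dimensions and parameter values, are hypothetical. A "clean label" here simply means that the label of every training sequence is the teacher's own output on that sequence.

```python
import numpy as np

def impulse_response(A_diag, B, C, length):
    """Return [C A^0 B, C A^1 B, ..., C A^(length-1) B] for a diagonal A."""
    # powers[k, i] = A_diag[i] ** k
    powers = A_diag[None, :] ** np.arange(length)[:, None]
    return powers @ (B * C)

def ssm_output(A_diag, B, C, x):
    """Output at the last time step for a scalar input sequence x.

    Unrolls s_t = A s_{t-1} + B x_t, y_t = C s_t with a zero initial state,
    which gives y_{T-1} = sum_k (C A^k B) x_{T-1-k}.
    """
    h = impulse_response(A_diag, B, C, len(x))
    return float(h @ x[::-1])

def empirical_loss(A_diag, B, C, X, y):
    """Mean squared error of an SSM on labeled sequences (X, y)."""
    preds = np.array([ssm_output(A_diag, B, C, x) for x in X])
    return float(np.mean((preds - y) ** 2))

# Assumed low-dimensional teacher: state dimension 1.
teacher = (np.array([0.5]), np.array([1.0]), np.array([1.0]))

# Training sequences are labeled by the teacher itself, so labels are clean.
rng = np.random.default_rng(0)
X = rng.standard_normal((8, 6))
y = np.array([ssm_output(*teacher, x) for x in X])

# A higher-dimensional student that happens to realize the teacher's mapping.
student = (
    np.array([0.5, 0.0, 0.0, 0.0]),
    np.array([1.0, 1.0, 1.0, 1.0]),
    np.array([1.0, 0.0, 0.0, 0.0]),
)
print(empirical_loss(*student, X, y))
```

Because the student has more state dimensions than the teacher, many different students can fit the same training set. Which of them gradient descent selects is what the implicit bias determines, and the paper's claim is that adding certain cleanly labeled sequences to `X` changes that selection.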

Peer Reviews

Decision·Submitted to ICLR 2025

Reviewer 01·Rating 8·Confidence 2

Strengths

The theoretical analysis of the implicit bias of SSMs is a strength of this work. Providing a basic understanding of SSMs, which are often considered an alternative to transformers, is indeed important for the community. The paper is nicely written: although I'm not an expert in this area, I could easily follow the logical steps of the paper based on proper interpretation and the sketches of the mathematical details. The experimental results that corroborate the theoretical analyses are a str

Weaknesses

Again, I'm not an expert in this area, but I'd like to make a few points that might be helpful to the authors and also clarify my understanding.
1. Is it possible to separately decompose and plot $\gamma^{(0)}(t)$ along with Figure 2? If so, I think it could give a more concrete explanation to aid the Interpretation part of Section 3.1.
2. Can the authors come up with a measure to qualitatively distinguish between the scenarios of the (leftmost) and (second) subplot of Figure 2? I'm aware that

Reviewer 02·Rating 6·Confidence 1

Strengths

1. This paper discovered clean-label poisoning of SSMs, which is a vulnerability unrecognized by previous works, highlighting a potential risk in the safety, robustness, and reliability of SSMs.
2. This paper provides a strong theoretical foundation and empirical validation to show that generalization can be ruined by introducing certain clean-labeled sequences.

Weaknesses

This paper falls outside my current area of expertise.

Reviewer 03·Rating 6·Confidence 3

Strengths

1. This paper presents a solid theoretical work, extending the results on the gradient descent (GD) implicit bias of Structured State Space Models (SSMs) from the population risk setup, as discussed in [1], to the finite empirical risk setup. The authors demonstrate that training a student SSM on sequences labeled by a low-dimensional teacher SSM exhibits an implicit bias conducive to generalization. Their dynamical analysis also establishes a connection with greedy low-rank learning.
2. Using

Weaknesses

1. The assumptions in Theorem 1 are overly restrictive, particularly as the structures of $A^*$, $B^*$, and $C^*$ seem to be very simple, and there is a lack of detailed explanation as to why such a simplified setup is justified in this context.
2. Regarding the special sequence data, Theorem 1 only provides an existence result without specifying a concrete construction method or offering a more detailed characterization, making it difficult for me to understand why the introduction of th

Code & Models

Repositories

yonislutzky98/imp-bias-ssm-poison·tf·Official

Videos

The Implicit Bias of Structured State Space Models Can Be Poisoned With Clean Labels·slideslive

Taxonomy

Topics·Neural Networks and Applications