The UNDO Flip-Flop: A Controlled Probe for Reversible Semantic State Management in State Space Model

Hongxu Zhou

arXiv:2604.05923 · cs.LG · April 8, 2026


TL;DR

This paper introduces the UNDO Flip-Flop task to test whether state space models can reliably learn reversible semantic state retrieval. One-layer and two-layer Mamba-2 models fail systematically on the task, even though the required solution is within their theoretical expressivity.

Contribution

The paper presents a new benchmark task that isolates reversible state retrieval in state space models, highlighting the gap between theoretical capacity and practical learnability.

Findings

1. Both one-layer and two-layer Mamba-2 models fail to learn the stack-based rollback mechanism.
2. Two-layer models collapse under adversarial retraction, achieving below-chance accuracy.
3. Retrieval, not storage, is the bottleneck in learning reversible states.

Abstract

State space models (SSMs) have been shown to possess the theoretical capacity to model both star-free sequential tasks and bounded hierarchical structures (Sarrof et al., 2024). However, formal expressivity results do not guarantee that gradient-based optimisation will reliably discover the corresponding solutions. Existing benchmarks probe either monotonic state tracking, as in the standard Flip-Flop task, or structural nesting, as in the Dyck languages, but neither isolates reversible semantic state retrieval. We introduce the UNDO Flip-Flop task to fill this gap. By extending the standard Flip-Flop with an UNDO operation, the task requires a model to maintain an implicit bounded stack and recover historical states under non-monotonic update sequences. We evaluate one-layer and two-layer Mamba-2 under this framework. Both variants fail to acquire the provably expressible stack-based rollback…
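To make the task description concrete, the following is a minimal sketch of a reference oracle for an UNDO Flip-Flop sequence. It is not the paper's implementation. The token names (`write`, `read`, `ignore`, `undo`), the `max_depth` bound, the initial value, and the rule that UNDO on an empty history does nothing are all assumptions chosen for illustration; the paper may define these details differently.

```python
from collections import deque


def undo_flip_flop(tokens, max_depth=4, initial=0):
    """Return the expected value at every READ in the sequence.

    tokens: list of (op, value) pairs; value is only used by "write".
    max_depth: assumed bound on how many earlier values can be restored.
    """
    # Bounded stack of earlier values; when full, the oldest entry is dropped.
    history = deque(maxlen=max_depth)
    current = initial
    outputs = []
    for op, value in tokens:
        if op == "write":
            history.append(current)  # remember the value being overwritten
            current = value
        elif op == "undo":
            # Assumed: UNDO with nothing left to roll back is a no-op.
            if history:
                current = history.pop()
        elif op == "read":
            outputs.append(current)
        elif op == "ignore":
            pass  # distractor token, state is unchanged
        else:
            raise ValueError(f"unknown op: {op!r}")
    return outputs


sequence = [
    ("write", 1),
    ("write", 0),
    ("read", None),   # most recent write
    ("undo", None),
    ("read", None),   # value restored after rolling back the last write
]
print(undo_flip_flop(sequence))  # [0, 1]
```

A standard Flip-Flop model only needs to hold the latest written value. Under the assumed semantics above, the second READ can only be answered by retrieving a value that has already been overwritten, which is the non-monotonic retrieval the task is designed to isolate.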
