Tiny Recursive Reasoning with Mamba-2 Attention Hybrid

Wenlong Wang; Fergal Reid

arXiv:2602.12078 · cs.AI · March 16, 2026

Open Access

TL;DR

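This paper replaces the Transformer blocks in the TRM recursive reasoning model with Mamba-2 hybrid operators at matched parameter count. On ARC-AGI-1 the hybrid raises pass@2 from 43.88% to 45.88% and widens its lead at higher K (+4.75% at pass@100), indicating broader candidate coverage, while pass@1 stays on par with the Transformer baseline.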

Contribution

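It shows that Mamba-2 hybrid operators can be substituted for the Transformer blocks in TRM's recursive scaffold at parameter parity (6.83M vs 6.86M parameters) without losing reasoning ability, and that the substitution improves pass@K on ARC-AGI-1 over the Transformer-based baseline.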

Findings

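01. The Mamba-2 hybrid improves pass@2, the official ARC-AGI-1 metric, by +2.0% (45.88% vs 43.88%).

02. The hybrid consistently outperforms at higher K values, reaching +4.75% at pass@100.

03. The hybrid maintains pass@1 parity with the Transformer-based model.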
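The findings above are stated in terms of pass@K. As a minimal sketch of how such a metric is commonly computed, assume each task has one target output and a ranked list of candidate predictions; a task counts as solved at K if the target appears among the first K candidates. The function name `pass_at_k` and the toy data are hypothetical and are not taken from the paper or from the ARC-AGI evaluation code.

```python
from typing import Hashable, Sequence


def pass_at_k(
    ranked_candidates: Sequence[Sequence[Hashable]],
    targets: Sequence[Hashable],
    k: int,
) -> float:
    """Fraction of tasks whose target is among the first k ranked candidates.

    Assumption (not from the paper): one target per task, and candidates
    are already sorted from most to least preferred.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    if len(ranked_candidates) != len(targets):
        raise ValueError("need one candidate list per target")
    if not targets:
        raise ValueError("need at least one task")

    solved = sum(
        1
        for candidates, target in zip(ranked_candidates, targets)
        if target in candidates[:k]
    )
    return solved / len(targets)


# Toy data: three tasks with ranked candidate answers.
toy_candidates = [["a", "b", "c"], ["x", "y"], ["m"]]
toy_targets = ["b", "z", "m"]

pass_1 = pass_at_k(toy_candidates, toy_targets, k=1)  # only the third task is solved
pass_2 = pass_at_k(toy_candidates, toy_targets, k=2)  # first and third tasks are solved
```

Under this definition pass@K can only stay equal or grow as K increases, so a gap that widens at larger K suggests the correct answer appears more often further down one model's candidate list, which is one reading of the "candidate coverage" mentioned in the TL;DR.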

Abstract

Recent work on recursive reasoning models like TRM demonstrates that tiny networks (7M parameters) can achieve strong performance on abstract reasoning tasks through latent recursion: iterative refinement in hidden representation space without emitting intermediate tokens. This raises a natural question about operator choice: Mamba-2's state space recurrence is itself a form of iterative refinement, making it a natural candidate for recursive reasoning, but does introducing Mamba-2 into the recursive scaffold preserve reasoning capability? We investigate this by replacing the Transformer blocks in TRM with Mamba-2 hybrid operators while maintaining parameter parity (6.83M vs 6.86M parameters). On ARC-AGI-1, we find that the hybrid improves pass@2 (the official metric) by +2.0% (45.88% vs 43.88%) and consistently outperforms at higher K values (+4.75% at pass@100), whilst…
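The abstract describes latent recursion as repeatedly refining a hidden state without emitting intermediate tokens. The following is a toy sketch of that control flow only, assuming a fixed input `x`, a latent state `z`, and a single shared operator applied for a fixed number of steps. The names `latent_recursion` and `toy_operator` are hypothetical, and the averaging update is a stand-in chosen for easy checking; it is not TRM's update rule, a Transformer block, or a Mamba-2 layer.

```python
from typing import Callable, List

Vector = List[float]


def latent_recursion(
    x: Vector,
    z: Vector,
    operator: Callable[[Vector, Vector], Vector],
    n_steps: int,
) -> Vector:
    """Refine latent state z by applying the same operator n_steps times.

    Nothing is decoded between steps: only the final latent is returned.
    """
    for _ in range(n_steps):
        z = operator(x, z)
    return z


def toy_operator(x: Vector, z: Vector) -> Vector:
    # Hypothetical stand-in for a learned block: move z halfway toward x.
    return [0.5 * zi + 0.5 * xi for xi, zi in zip(x, z)]


z_final = latent_recursion([1.0, 2.0], [0.0, 0.0], toy_operator, n_steps=3)
print(z_final)  # [0.875, 1.75]
```

In this framing, swapping the operator (for example, a Transformer block for a Mamba-2 hybrid block) changes only what is passed as `operator`; the recursive scaffold around it stays the same.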

Taxonomy

Topics: Advanced Graph Neural Networks · Constraint Satisfaction and Optimization · Neural Networks and Reservoir Computing