Birdie: Advancing State Space Models with Reward-Driven Objectives and Curricula
Sam Blouir, Jimmy T.H. Smith, Antonios Anastasopoulos, Amarda Shehu

TL;DR
Birdie is a training method for state space models (SSMs) that improves their long-range in-context retrieval without changing their architecture, narrowing the gap with Transformers while keeping the efficiency of SSMs.
Contribution
The paper presents a training procedure, Birdie, that improves the in-context retrieval abilities of SSMs by combining bidirectional input processing with a dynamic mixture of pre-training objectives tuned by reinforcement learning, without architectural modifications.
Findings
Improved performance on retrieval tasks like phone book lookup and question answering.
Narrowed the performance gap between SSMs and Transformers.
Retained computational efficiency of SSMs.
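To make the phone book lookup task concrete, here is a minimal sketch of how one such retrieval example could be generated. The entry format, the question wording, and the function name `make_phone_book_example` are assumptions for illustration and are not taken from the paper.

```python
import random


def make_phone_book_example(num_entries, rng):
    """Build one hypothetical phone-book lookup example.

    Returns a (prompt, answer) pair: the prompt lists name/number entries
    followed by a question about one of them, and the answer is that number.
    The layout here is an assumption, not the paper's benchmark format.
    """
    names = [f"person_{i}" for i in range(num_entries)]
    numbers = [f"555-{rng.randrange(10000):04d}" for _ in names]
    book = dict(zip(names, numbers))

    # Pick the entry the model must retrieve from the context.
    query = rng.choice(names)

    context = "\n".join(f"{name}: {number}" for name, number in book.items())
    prompt = f"{context}\n\nWhat is the phone number of {query}?"
    return prompt, book[query]


rng = random.Random(0)
prompt, answer = make_phone_book_example(5, rng)
print(prompt)
print("Expected answer:", answer)
```

Raising `num_entries` lengthens the context, which is what stresses long-range retrieval in a fixed-size recurrent state.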
Abstract
Efficient state space models (SSMs), such as linear recurrent neural networks and linear attention variants, offer computational advantages over Transformers but struggle with tasks requiring long-range in-context retrieval, such as text copying, associative recall, and question answering over long contexts. Previous efforts to address these challenges have focused on architectural modifications, often reintroducing computational inefficiencies. In this paper, we propose a novel training procedure, Birdie, that significantly enhances the in-context retrieval capabilities of SSMs without altering their architecture. Our approach combines bidirectional input processing with dynamic mixtures of specialized pre-training objectives, optimized via reinforcement learning. We introduce a new bidirectional SSM architecture that seamlessly transitions from bidirectional context processing to causal…
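The idea of a dynamic, reward-driven mixture of pre-training objectives can be illustrated with a small sampler. This is a sketch under stated assumptions, not the paper's algorithm: it swaps in a simple softmax over smoothed rewards, where the reward is the relative loss improvement on an objective. The class name `ObjectiveMixture`, the objective labels, and the `temperature` and `smoothing` parameters are all hypothetical.

```python
import math
import random


class ObjectiveMixture:
    """Hypothetical softmax sampler over pre-training objectives.

    Not Birdie's actual procedure: a plain bandit-style stand-in that
    samples objectives in proportion to how much they recently helped.
    """

    def __init__(self, objectives, temperature=1.0, smoothing=0.9):
        self.objectives = list(objectives)
        self.temperature = temperature
        self.smoothing = smoothing
        # Running reward estimate per objective, starting neutral.
        self.rewards = {name: 0.0 for name in self.objectives}

    def probabilities(self):
        # Softmax over running rewards (max-subtracted for stability).
        scaled = [self.rewards[n] / self.temperature for n in self.objectives]
        top = max(scaled)
        exps = [math.exp(s - top) for s in scaled]
        total = sum(exps)
        return {n: e / total for n, e in zip(self.objectives, exps)}

    def sample(self, rng=random):
        # Draw the objective to use for the next training batch.
        probs = self.probabilities()
        weights = [probs[n] for n in self.objectives]
        return rng.choices(self.objectives, weights=weights, k=1)[0]

    def update(self, name, loss_before, loss_after):
        # Assumed reward: relative loss improvement on this objective,
        # folded into an exponential moving average.
        reward = (loss_before - loss_after) / max(loss_before, 1e-8)
        old = self.rewards[name]
        self.rewards[name] = self.smoothing * old + (1 - self.smoothing) * reward


# Objective labels below are placeholders, not the paper's objective set.
mixture = ObjectiveMixture(["next_token", "copying", "infilling"])
mixture.update("copying", loss_before=2.0, loss_after=1.0)
print(mixture.probabilities())
print(mixture.sample(random.Random(0)))
```

In a training loop, `sample` would pick the objective for each batch and `update` would be called after evaluating that objective, so objectives that are still improving get sampled more often.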
Taxonomy
Topics: Complex Systems and Decision Making · Organizational Learning and Leadership
Methods: Softmax · Attention Is All You Need
