L2T-Hyena: Enhancing State-Space Models with an Adaptive Learn-to-Teach Framework

Fatemeh Sohbati; Farzan Haddadi; Hamid Salahinejad

arXiv:2511.05926·cs.IT·February 25, 2026

TL;DR

This paper introduces L2T-Hyena, which pairs the Hyena state-space model with a teacher-guided Dynamic Loss Network that adapts the training loss online. The authors report improved language modeling performance on the Penn Treebank (PTB) and WikiText-103 benchmarks.

Contribution

It proposes a novel adaptive loss framework for state-space models using a learn-to-teach paradigm, enhancing training effectiveness and model performance.

Findings

01 L2T-Hyena outperforms vanilla Hyena and Transformer baselines on PTB and WikiText-103.

02 Adaptive loss functions improve sequence modeling accuracy.

03 The approach demonstrates significant gains in perplexity metrics.

Abstract

State-space models (SSMs) have recently emerged as efficient alternatives to computationally intensive architectures such as Transformers for sequence modeling. However, their training typically relies on static loss functions, which may be suboptimal at different stages of learning. In this work, we introduce a hybrid model that integrates the Hyena architecture with a Dynamic Loss Network (DLN) under a Learning-to-Teach (L2T) paradigm, referred to as L2T-DLN. In this framework, the Hyena model serves as a student whose loss function is adapted online, while a teacher model, equipped with a memory of the student's past performance, guides the DLN to dynamically trade off the primary cross-entropy objective and a regularization term. We evaluate the proposed L2T-Hyena model on the Penn Treebank (PTB) and WikiText-103 language modeling benchmarks and compare it against both a vanilla…
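
The trade-off described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the excerpt does not specify the DLN or teacher architectures, so everything here is an assumption made for illustration, including the names `ToyTeacher`, `mixing_weight`, and `dynamic_loss`, the convex combination of the two terms, the L2 penalty as the regularizer, and the sigmoid-of-trend rule for choosing the weight.

```python
import math
from collections import deque


def cross_entropy(probs, target):
    """Negative log-likelihood of the target index under a probability vector."""
    return -math.log(probs[target])


def l2_penalty(weights):
    """Sum of squared weights; stands in for the unspecified regularization term."""
    return sum(w * w for w in weights)


class ToyTeacher:
    """Hypothetical teacher with a bounded memory of the student's past losses.

    It maps the recent loss trend to a mixing weight in (0, 1). The real
    teacher in the paper is a learned model; this rule is an assumption.
    """

    def __init__(self, memory_size=5):
        self.memory = deque(maxlen=memory_size)

    def observe(self, student_loss):
        self.memory.append(student_loss)

    def mixing_weight(self):
        if len(self.memory) < 2:
            return 0.5  # no trend yet, so weight both terms equally
        trend = self.memory[-1] - self.memory[0]  # positive when loss is rising
        return 1.0 / (1.0 + math.exp(-trend))  # sigmoid: rising loss -> more regularization


def dynamic_loss(probs, target, weights, lam):
    """Assumed form of the adapted loss: (1 - lam) * CE + lam * regularizer."""
    return (1.0 - lam) * cross_entropy(probs, target) + lam * l2_penalty(weights)


if __name__ == "__main__":
    teacher = ToyTeacher()
    student_weights = [0.3, -0.2]
    # Hypothetical per-step predictions from the student, each with target index 0.
    for probs in ([0.4, 0.6], [0.5, 0.5], [0.7, 0.3]):
        lam = teacher.mixing_weight()
        loss = dynamic_loss(probs, 0, student_weights, lam)
        teacher.observe(cross_entropy(probs, 0))
        print(f"lam={lam:.3f} loss={loss:.4f}")
```

In this sketch the weight shifts toward the cross-entropy term while the student's loss is falling and toward the regularizer when it rises. A learned teacher would replace that fixed rule with a trained policy.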

Taxonomy

Topics: Machine Learning in Healthcare · Topic Modeling · Machine Learning and Data Classification