Accelerating Toeplitz Neural Network with Constant-time Inference Complexity
Zhen Qin, Yiran Zhong

TL;DR
This paper introduces a method to convert Toeplitz Neural Networks into State Space Models during inference, enabling constant-time inference without retraining, thus combining the strengths of both models for language tasks.
Contribution
The paper formulates the TNN-to-SSM conversion as an optimization problem with a closed-form solution computed via the DFT. The method applies to any LongConv-based model and improves inference efficiency.
Findings
Achieves constant inference complexity for TNNs
Maintains numerical stability without retraining
Is more numerically stable than gradient-descent-based alternatives
Abstract
Toeplitz Neural Networks (TNNs) have exhibited outstanding performance in various sequence modeling tasks. They outperform commonly used Transformer-based models while benefiting from log-linear space-time complexities. On the other hand, State Space Models (SSMs) achieve lower performance than TNNs in language modeling but offer the advantage of constant inference complexity. In this paper, we aim to combine the strengths of TNNs and SSMs by converting TNNs to SSMs during inference, thereby enabling TNNs to achieve the same constant inference complexities as SSMs. To accomplish this, we formulate the conversion process as an optimization problem and provide a closed-form solution. We demonstrate how to transform the target equation into a Vandermonde linear system problem, which can be efficiently solved using the Discrete Fourier Transform (DFT). Notably, our method requires no…
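The Vandermonde-to-DFT step mentioned in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes a scalar-input, scalar-output SSM with a diagonal state matrix whose entries are the n-th roots of unity, which is one choice that turns the Vandermonde matrix into the DFT matrix. The names `toeplitz_to_ssm`, `ssm_step`, `run_ssm`, and `causal_conv` are hypothetical. The inverse DFT is written out naively with the standard library; an FFT routine would normally replace it.

```python
import cmath


def toeplitz_to_ssm(kernel):
    """Return (lam, b) such that kernel[i] == sum_k b[k] * lam[k]**i for 0 <= i < n."""
    n = len(kernel)
    # Assumption: use the n-th roots of unity as the diagonal state matrix.
    lam = [cmath.exp(-2j * cmath.pi * k / n) for k in range(n)]
    # With this choice the Vandermonde matrix V[i][k] = lam[k]**i is the DFT matrix,
    # so solving V b = kernel is an inverse DFT (naive O(n^2) version here).
    b = [
        sum(kernel[i] * cmath.exp(2j * cmath.pi * i * k / n) for i in range(n)) / n
        for k in range(n)
    ]
    return lam, b


def ssm_step(state, x, lam, b):
    """One recurrent step: the cost depends on the state size, not on the position."""
    new_state = [l * h + bk * x for l, h, bk in zip(lam, state, b)]
    # The output projection is all ones; the imaginary part is only rounding noise
    # when the kernel and the input are real.
    return new_state, sum(new_state).real


def run_ssm(kernel, xs):
    """Run the converted SSM token by token over the inputs xs."""
    lam, b = toeplitz_to_ssm(kernel)
    state = [0j] * len(kernel)
    ys = []
    for x in xs:
        state, y = ssm_step(state, x, lam, b)
        ys.append(y)
    return ys


def causal_conv(kernel, xs):
    """Reference: direct causal Toeplitz product y[j] = sum_{i<=j} kernel[j-i] * xs[i]."""
    return [sum(kernel[j - i] * xs[i] for i in range(j + 1)) for j in range(len(xs))]
```

For a real kernel and an input no longer than the kernel, `run_ssm(kernel, xs)` should match `causal_conv(kernel, xs)` up to floating-point error. Because the roots of unity make the recovered kernel periodic with period n, this sketch is only valid for sequences of at most n tokens.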
Taxonomy
Topics: Neural Networks and Applications · Model Reduction and Neural Networks · Topic Modeling
