TL;DR
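MinMax Recurrent Neural Cascades (RNCs) are a recurrent model built on the MinMax algebra. They can express all regular languages, can be evaluated in parallel in time logarithmic in the input length, and are not affected by vanishing or exploding gradients. Reported results are strong on synthetic tasks and competitive on next-token prediction.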
Contribution
The paper introduces MinMax RNCs, a new recurrence framework with strong theoretical properties and demonstrated practical capabilities on synthetic and real-world tasks.
Findings
MinMax RNCs can express all regular languages.
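They can be evaluated in parallel with a runtime that is logarithmic in the input length, given enough processors, and they can also be evaluated sequentially.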
Empirical results show superior performance on synthetic tasks and competitive results on next-token prediction.
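The logarithmic-runtime claim can be illustrated with a small sketch. This summary does not give the paper's exact recurrence, so the code assumes a hypothetical scalar update x_t = max(min(a_t, x_{t-1}), b_t); the names `combine`, `sequential_states`, and `scan_states` are illustrative and not taken from the paper. Because min distributes over max, composing two such updates yields another update of the same form, so the composition is associative and the prefix states can be computed with a parallel scan in about log2(T) rounds.

```python
def combine(first, second):
    """Compose two maps x -> max(min(a, x), b); `first` is applied first.

    Uses distributivity of min over max:
    max(min(a2, max(min(a1, x), b1)), b2)
        == max(min(min(a1, a2), x), max(min(a2, b1), b2))
    """
    a1, b1 = first
    a2, b2 = second
    return min(a1, a2), max(min(a2, b1), b2)


def sequential_states(a, b, x0):
    """Reference evaluation: one step per input position."""
    x, states = x0, []
    for a_t, b_t in zip(a, b):
        x = max(min(a_t, x), b_t)  # assumed (hypothetical) update rule
        states.append(x)
    return states


def scan_states(a, b, x0):
    """Hillis-Steele inclusive scan over the composed maps.

    Each round only reads the previous round's values, so with one
    processor per position a round takes constant time, and there are
    ceil(log2(T)) rounds. Here the rounds are simulated sequentially.
    """
    maps = list(zip(a, b))
    shift = 1
    while shift < len(maps):
        maps = [
            combine(maps[t - shift], maps[t]) if t >= shift else maps[t]
            for t in range(len(maps))
        ]
        shift *= 2
    # maps[t] is now the composition of updates 0..t; apply it to x0.
    return [max(min(a_t, x0), b_t) for a_t, b_t in maps]


print(scan_states([0.2, 0.9], [0.1, 0.3], 1.0))  # [0.2, 0.3]
```

Under this assumed update, every state is a copy of either the initial state or one of the inputs, which gives some intuition for the bounded-state property stated in the abstract; the paper's actual construction may differ.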
Abstract
We show that the MinMax algebra provides a form of recurrence that is expressively powerful, efficiently implementable, and, most importantly, not affected by vanishing or exploding gradients. We call MinMax Recurrent Neural Cascades (RNCs) the models obtained by cascading several layers of neurons that employ such recurrence. We show that MinMax RNCs enjoy many favourable theoretical properties. First, their formal expressivity includes all regular languages, arguably the maximal expressivity for a finite-memory system. Second, they can be evaluated in parallel with a runtime that is logarithmic in the input length, given enough processors; they can also be evaluated sequentially. Third, their state and activations are bounded uniformly for all input lengths. Fourth, at almost all points, their loss gradient exists and is bounded. Fifth, they do not exhibit a vanishing state…
