Invertible Memory Flow Networks

Liyu Zerihun; Alexandr Plashchinsky

arXiv:2602.00535 · cs.LG · February 3, 2026

Open Access

TL;DR

The paper introduces Invertible Memory Flow Networks, a novel approach for long sequence compression that decomposes the task into simpler pairwise merges, enabling efficient and scalable sequence modeling.

Contribution

The paper proposes an invertible network architecture that decomposes sequence compression into pairwise merges arranged as a binary tree, so that depth grows as O(log N) in sequence length rather than requiring end-to-end compression of the full sequence.

Findings

01. Validated on long MNIST sequences

02. Achieved effective high-dimensional data compression

03. Demonstrated constant-cost inference with distilled models

Abstract

Long-sequence neural memory remains a challenging problem. RNNs and their variants suffer from vanishing gradients, and Transformers suffer from quadratic scaling. Furthermore, compressing long sequences into a finite fixed representation remains an intractable problem due to the difficult optimization landscape. Invertible Memory Flow Networks (IMFN) make long-sequence compression tractable through factorization: instead of learning end-to-end compression, we decompose the problem into pairwise merges using a binary tree of "sweeper" modules. Rather than learning to compress long sequences, each sweeper learns a much simpler 2-to-1 compression task, achieving O(log N) depth with sublinear error accumulation in sequence length. For online inference, we distill the model into a constant-cost recurrent student that achieves O(1) sequential steps. Empirical results validate IMFN on long MNIST…
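The binary-tree factorization described in the abstract can be illustrated with a minimal sketch. It shows only the control flow of repeated 2-to-1 merges and the resulting O(log N) depth. The names `toy_sweeper` and `tree_compress` are hypothetical, and the element-wise mean is a stand-in assumption: it is neither learned nor invertible, and it is not the paper's sweeper module.

```python
from typing import Callable, List, Sequence, Tuple

Vector = Tuple[float, ...]


def toy_sweeper(left: Vector, right: Vector) -> Vector:
    """Placeholder 2-to-1 merge (element-wise mean).

    Assumption for illustration only: the paper's sweeper is a learned
    module; this stand-in just has the same two-in, one-out shape.
    """
    return tuple((a + b) / 2.0 for a, b in zip(left, right))


def tree_compress(
    items: Sequence[Vector],
    merge: Callable[[Vector, Vector], Vector] = toy_sweeper,
) -> Tuple[Vector, int]:
    """Reduce a sequence to one vector by pairwise merges, level by level.

    Returns the final vector and the number of tree levels used.
    """
    if not items:
        raise ValueError("need at least one item")
    level: List[Vector] = list(items)
    depth = 0
    while len(level) > 1:
        # Merge neighbours (0,1), (2,3), ...; an odd leftover is carried up.
        merged = [merge(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2 == 1:
            merged.append(level[-1])
        level = merged
        depth += 1
    return level[0], depth


sequence = [(float(i), float(-i)) for i in range(8)]
summary, depth = tree_compress(sequence)
print(summary, depth)
```

With 8 inputs the loop runs for 3 levels, matching ceil(log2 8); each call to `merge` sees only two vectors, which is the sense in which every module faces a 2-to-1 task rather than the full sequence.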

Taxonomy

Topics: Ferroelectric and Negative Capacitance Devices · Neural Networks and Reservoir Computing · Generative Adversarial Networks and Image Synthesis