iMixer: hierarchical Hopfield network implies an invertible, implicit and iterative MLP-Mixer

Toshihiro Ota; Masato Taki

arXiv:2304.13061 · cs.LG · April 2, 2024 · 1 citation

TL;DR

This paper introduces iMixer, a generalization of MLP-Mixer derived from the hierarchical Hopfield network. The model is invertible, implicit, and iterative, achieves competitive image classification performance, and offers insight into Transformer-like architectures.

Contribution

The paper proposes iMixer, a novel hierarchical Hopfield network-based model that generalizes MLP-Mixer with invertible and iterative features, advancing understanding of Transformer-like architectures.

Findings

01 iMixer achieves performance comparable or superior to vanilla MLP-Mixer.

02 iMixer exhibits stable learning despite its invertible, implicit, and iterative structure.

03 The study supports the theoretical link between Hopfield networks and Transformer architectures.

Abstract

In the last few years, the success of Transformers in computer vision has stimulated the discovery of many alternative models that compete with Transformers, such as the MLP-Mixer. Despite their weak inductive bias, these models have achieved performance comparable to well-studied convolutional neural networks. Recent studies on modern Hopfield networks suggest a correspondence between certain energy-based associative memory models and Transformers or MLP-Mixer, and shed some light on the theoretical background of the design of Transformer-type architectures. In this paper, we generalize the correspondence to the recently introduced hierarchical Hopfield network, and find iMixer, a novel generalization of the MLP-Mixer model. Unlike ordinary feedforward neural networks, iMixer involves MLP layers that propagate forward from the output side to the input side. We characterize the module as an…
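One way to picture an MLP that runs "from the output side to the input side" is to invert a residual map y = x + g(x) by fixed-point iteration, which is at once invertible, implicit, and iterative. The sketch below illustrates that general technique only. It is not taken from the paper or the official repository, and whether iMixer uses exactly this scheme is an assumption. The names make_residual_branch, forward_residual, and inverse_residual are hypothetical, as are the dimensions and the 0.7 spectral-norm scaling chosen to keep g contractive.

```python
import numpy as np


def scale_to_spectral_norm(w, target):
    # Rescale a matrix so that its largest singular value equals `target`.
    return w * (target / np.linalg.norm(w, 2))


def make_residual_branch(dim, hidden, seed=0, norm=0.7):
    # Hypothetical two-layer MLP g(x) = W2 tanh(W1 x).
    # Each weight has spectral norm 0.7 and tanh is 1-Lipschitz,
    # so g has Lipschitz constant at most 0.49 < 1 (a contraction).
    rng = np.random.default_rng(seed)
    w1 = scale_to_spectral_norm(rng.standard_normal((hidden, dim)), norm)
    w2 = scale_to_spectral_norm(rng.standard_normal((dim, hidden)), norm)

    def g(x):
        return w2 @ np.tanh(w1 @ x)

    return g


def forward_residual(g, x):
    # Ordinary residual map: y = x + g(x).
    return x + g(x)


def inverse_residual(g, y, num_iters=50):
    # Solve x = y - g(x) by fixed-point iteration, starting from x = y.
    # Convergence relies on g being a contraction (assumed above).
    x = y.copy()
    for _ in range(num_iters):
        x = y - g(x)
    return x


g = make_residual_branch(dim=8, hidden=16)
x_true = np.random.default_rng(1).standard_normal(8)
y = forward_residual(g, x_true)
x_rec = inverse_residual(g, y)
max_error = float(np.max(np.abs(x_rec - x_true)))
```

The output x_rec is defined only implicitly, as the solution of x = y - g(x), and is reached by repeating the same MLP. The number of iterations trades compute for accuracy, and the contraction condition on g is what makes the inverse well defined in this sketch.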

Code & Models

Repositories

toshihiro-ota/imixer
PyTorch · Official

Taxonomy

Topics: Advanced Memory and Neural Computing · Advanced Neural Network Applications · Neural Networks and Applications

Methods: Layer Normalization · Average Pooling · Dense Connections · Global Average Pooling · Dropout · Residual Connection · MLP-Mixer