Lateralization MLP: A Simple Brain-inspired Architecture for Diffusion

Zizhao Hu; Mohammad Rostami

arXiv:2405.16098·cs.CV·May 28, 2024


TL;DR

The paper introduces Lateralization MLP (L-MLP), a brain-inspired architecture that rivals transformers in diffusion tasks while being more efficient, by mimicking human brain lateralization in a simple, scalable MLP design.

Contribution

Proposes the Lateralization MLP (L-MLP), a novel brain-inspired architecture that outperforms other MLP variants and matches transformer performance in diffusion tasks.

Findings

1. L-MLP outperforms other MLP variants.

2. L-MLP performs comparably to transformers in diffusion tasks.

3. L-MLP is highly efficient and effective.

Abstract

The Transformer architecture has dominated machine learning across a wide range of tasks. Its defining characteristic is an expensive scaled dot-product attention mechanism that models inter-token interactions, which is known to be the reason behind its success. However, this mechanism has no direct parallel in the human brain, which raises the question of whether scaled dot-product attention is necessary for intelligence with strong expressive power. Inspired by the lateralization of the human brain, we propose a simple but effective new architecture called the Lateralization MLP (L-MLP). L-MLP blocks can be stacked to build complex architectures. Each L-MLP block is based on a multi-layer perceptron (MLP) that permutes data dimensions, processes each dimension in parallel, merges the results, and finally passes them through a joint MLP. We discover that this specific design…
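The block structure described in the abstract (permute, process each dimension in parallel, merge, joint MLP) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the two-branch layout (one MLP along channels, one along tokens), the concatenation used for merging, the GELU activation, the residual connection, and all names (`l_mlp_block`, `init_mlp`, the sizes `N`, `D`, `H`) are assumptions chosen only to make the data flow concrete. The official repository should be consulted for the actual design.

```python
import numpy as np

def gelu(x):
    # Tanh approximation of GELU (activation choice is an assumption).
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def mlp(x, w1, b1, w2, b2):
    # Two-layer perceptron applied along the last axis of x.
    return gelu(x @ w1 + b1) @ w2 + b2

def init_mlp(rng, d_in, d_hidden, d_out):
    # Hypothetical initialiser: small random weights, zero biases.
    return (rng.normal(0.0, 0.02, (d_in, d_hidden)), np.zeros(d_hidden),
            rng.normal(0.0, 0.02, (d_hidden, d_out)), np.zeros(d_out))

def l_mlp_block(x, channel_params, token_params, joint_params):
    # x has shape (num_tokens, num_channels).
    # Branch 1: process along the channel dimension.
    channel_out = mlp(x, *channel_params)
    # Branch 2: permute so tokens are last, process along tokens, permute back.
    token_out = mlp(x.T, *token_params).T
    # Merge the two branches (concatenation is an assumed merge rule).
    merged = np.concatenate([channel_out, token_out], axis=-1)
    # Joint MLP maps back to the input width; the residual is an assumption.
    return x + mlp(merged, *joint_params)

rng = np.random.default_rng(0)
N, D, H = 16, 32, 64  # tokens, channels, hidden width (illustrative sizes)
x = rng.normal(size=(N, D))
channel_params = init_mlp(rng, D, H, D)
token_params = init_mlp(rng, N, H, N)
joint_params = init_mlp(rng, 2 * D, H, D)
y = l_mlp_block(x, channel_params, token_params, joint_params)
print(y.shape)  # (16, 32)
```

Because the block maps a `(N, D)` array to a `(N, D)` array, several such blocks can be applied in sequence, which matches the abstract's remark that stacking L-MLP blocks yields more complex architectures. Note that no token-to-token attention matrix is formed; inter-token mixing in this sketch comes only from the MLP applied along the token axis.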


Code & Models

Repositories

zizhao-hu/l-mlp (PyTorch, official)


Taxonomy

Topics: Neural Networks and Applications

Methods: Attention Is All You Need · Linear Layer · Byte Pair Encoding · Label Smoothing · Adam · Residual Connection · Position-Wise Feed-Forward Layer · Multi-Head Attention · Dropout · Dense Connections