Strip-MLP: Efficient Token Interaction for Vision MLP

Guiping Cao; Shengda Luo; Wenjian Huang; Xiangyuan Lan; Dongmei Jiang; Yaowei Wang; Jianguo Zhang

arXiv:2307.11458·cs.CV·July 24, 2023

TL;DR

Strip-MLP introduces a novel token interaction method with cross-strip, cross-patch, and local region modules, significantly enhancing the expressive power of MLP-based vision models, especially on small datasets.

Contribution

The paper proposes Strip-MLP, a new MLP paradigm whose modules improve token interaction regardless of spatial resolution, outperforming existing MLP-based models on multiple datasets.

Findings

01 Achieves +2.44% higher average Top-1 accuracy than existing MLP-based models on Caltech-101.

02 Achieves +2.16% higher average Top-1 accuracy than existing MLP-based models on CIFAR-100.

03 Outperforms existing MLP-based models on multiple datasets.

Abstract

The token interaction operation is one of the core modules in MLP-based models, exchanging and aggregating information between different spatial locations. However, the power of token interaction along the spatial dimension depends heavily on the spatial resolution of the feature maps, which limits the model's expressive ability, especially in deep layers where the features are down-sampled to a small spatial size. To address this issue, we present a novel method called Strip-MLP that enriches token interaction in three ways. Firstly, we introduce a new MLP paradigm called the Strip MLP layer, which allows a token to interact with other tokens in a cross-strip manner, enabling the tokens in a row (or column) to contribute to information aggregation in adjacent but different strips of rows (or columns). Secondly, a Cascade Group Strip Mixing…
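The resolution dependence described in the abstract can be illustrated with a minimal sketch. It shows a plain token-mixing step along the width axis, not the Strip MLP layer itself, whose details are not given in the text above. The 224-pixel input, the four-stage stride layout (4, 8, 16, 32), the single-channel feature map, and all function names are assumptions made for illustration only.

```python
# Illustrative sketch only: plain token mixing along the width axis.
# This is NOT the Strip MLP layer from the paper; names are hypothetical.


def mix_along_width(feature_map, weights):
    """Mix tokens within each row: out[h][j] = sum_i weights[j][i] * x[h][i].

    feature_map: H x W list of lists (single channel, for simplicity).
    weights:     W x W list of lists; its size is fixed by the width W.
    """
    width = len(weights)
    mixed = []
    for row in feature_map:
        if len(row) != width:
            raise ValueError("weights must be W x W for a feature map of width W")
        mixed.append(
            [sum(w * x for w, x in zip(weights[j], row)) for j in range(width)]
        )
    return mixed


def stage_resolutions(input_size=224, strides=(4, 8, 16, 32)):
    """Spatial side length at each stage (assumed hierarchical layout)."""
    return [input_size // s for s in strides]


def averaging_weights(width):
    """Hypothetical W x W weights that average all tokens in a row."""
    return [[1.0 / width] * width for _ in range(width)]


if __name__ == "__main__":
    for side in stage_resolutions():
        # Each output token can draw on at most `side` tokens in its own row.
        print(f"side={side}  tokens per row={side}  mixing weights={side * side}")
    small = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
    print(mix_along_width(small, averaging_weights(3)))
```

Under these assumptions, the last stage mixes only a handful of tokens per row, and each row is mixed independently of its neighbours. The cross-strip interaction described in the abstract is aimed at that limitation.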

Code & Models

Repositories

med-process/strip_mlp
PyTorch · Official

Videos

Strip-MLP: Efficient Token Interaction for Vision MLP · YouTube

Taxonomy

Topics: Advanced Neural Network Applications · AI in cancer detection · Domain Adaptation and Few-Shot Learning