SplitMixer: Fat Trimmed From MLP-like Models

Ali Borji; Sikun Lin

arXiv:2207.10255 · cs.CV · July 26, 2022 · 1 citation

TL;DR

SplitMixer introduces a lightweight, flexible MLP-like architecture for visual recognition that balances accuracy and efficiency, outperforming or matching state-of-the-art models with fewer parameters and FLOPS.

Contribution

The paper proposes SplitMixer, an isotropic MLP-like architecture that mixes spatial information with two sequential depthwise 1D kernels and mixes channel information over channel segments, reaching accuracy comparable to existing MLP-like models with fewer parameters.

Findings

1. SplitMixer achieves around 94% accuracy on CIFAR-10 with 0.28M parameters.

2. SplitMixer matches ConvMixer's CIFAR-10 accuracy with fewer parameters.

3. SplitMixer attains 73% accuracy on CIFAR-100, with about 52% fewer parameters and FLOPS.

Abstract

We present SplitMixer, a simple and lightweight isotropic MLP-like architecture, for visual recognition. It contains two types of interleaving convolutional operations to mix information across spatial locations (spatial mixing) and channels (channel mixing). The first one includes sequentially applying two depthwise 1D kernels, instead of a 2D kernel, to mix spatial information. The second one is splitting the channels into overlapping or non-overlapping segments, with or without shared parameters, and applying our proposed channel mixing approaches or 3D convolution to mix channel information. Depending on design choices, a number of SplitMixer variants can be constructed to balance accuracy, the number of parameters, and speed. We show, both theoretically and experimentally, that SplitMixer performs on par with the state-of-the-art MLP-like models while having a significantly lower…
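The parameter savings implied by the two mixing ideas in the abstract can be checked with simple arithmetic. The sketch below is not taken from the paper or its repository: the function names are hypothetical, biases are ignored, only the non-overlapping segment case is modeled (with every segment receiving its own 1x1 mixing), and the values `c = 256`, `k = 5`, `s = 2` are illustrative assumptions rather than a reported configuration.

```python
# Back-of-the-envelope weight counts for the two mixing ideas in the abstract.
# All names and example sizes are illustrative assumptions; biases are ignored.


def depthwise_2d_params(channels: int, k: int) -> int:
    """Weights in a depthwise k x k convolution: one k x k filter per channel."""
    return channels * k * k


def depthwise_1d_pair_params(channels: int, k: int) -> int:
    """Weights in a depthwise 1 x k convolution followed by a depthwise k x 1 one."""
    return channels * 2 * k


def pointwise_params(channels: int) -> int:
    """Weights in a full 1 x 1 (pointwise) convolution mixing all channels."""
    return channels * channels


def split_pointwise_params(channels: int, segments: int, shared: bool = False) -> int:
    """Weights when channels are cut into equal non-overlapping segments and each
    segment is mixed by its own 1 x 1 convolution (or one shared across segments)."""
    if channels % segments != 0:
        raise ValueError("channels must be divisible by segments")
    seg = channels // segments
    per_segment = seg * seg
    return per_segment if shared else segments * per_segment


if __name__ == "__main__":
    c, k, s = 256, 5, 2  # assumed sizes, chosen only for illustration
    print("spatial mixing, 2D kernel:     ", depthwise_2d_params(c, k))
    print("spatial mixing, two 1D kernels:", depthwise_1d_pair_params(c, k))
    print("channel mixing, full 1x1:      ", pointwise_params(c))
    print("channel mixing, split:         ", split_pointwise_params(c, s))
    print("channel mixing, split + shared:", split_pointwise_params(c, s, shared=True))
```

Under these assumptions, spatial-mixing weights per channel drop from k² to 2k, and channel-mixing weights drop from c² to c²/s without sharing, or c²/s² with sharing. The overlapping-segment and 3D-convolution variants mentioned in the abstract are not covered by this count.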

Code & Models

Repositories

aliborji/splitmixer (PyTorch, official)

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Advanced Neural Network Applications · Advanced Vision and Imaging

Methods: Average Pooling · Residual Connection · Global Average Pooling · Dense Connections · Convolution · Dropout · Layer Normalization · MLP-Mixer · 3D Convolution