SplitMixer: Fat Trimmed From MLP-like Models
Ali Borji, Sikun Lin

TL;DR
SplitMixer introduces a lightweight, flexible MLP-like architecture for visual recognition that balances accuracy and efficiency, outperforming or matching state-of-the-art models with fewer parameters and FLOPS.
Contribution
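The paper proposes SplitMixer, an isotropic MLP-like architecture that mixes spatial information with two sequential depthwise 1D convolutions in place of a 2D kernel, and mixes channel information over channel segments rather than over all channels at once, reaching comparable accuracy with fewer parameters.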
Findings
SplitMixer achieves around 94% accuracy on CIFAR-10 with 0.28M parameters.
SplitMixer matches ConvMixer's CIFAR-10 accuracy with fewer parameters.
SplitMixer attains 73% accuracy on CIFAR-100, with about 52% fewer parameters and FLOPS.
Abstract
We present SplitMixer, a simple and lightweight isotropic MLP-like architecture, for visual recognition. It contains two types of interleaving convolutional operations to mix information across spatial locations (spatial mixing) and channels (channel mixing). The first one includes sequentially applying two depthwise 1D kernels, instead of a 2D kernel, to mix spatial information. The second one is splitting the channels into overlapping or non-overlapping segments, with or without shared parameters, and applying our proposed channel mixing approaches or 3D convolution to mix channel information. Depending on design choices, a number of SplitMixer variants can be constructed to balance accuracy, the number of parameters, and speed. We show, both theoretically and experimentally, that SplitMixer performs on par with the state-of-the-art MLP-like models while having a significantly lower…
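The parameter savings described in the abstract can be illustrated with simple weight counts. The sketch below is not the authors' code: the function names, the hidden width of 256, the kernel size of 5, and the choice of two segments are all hypothetical values chosen for illustration. Biases are ignored, and only non-overlapping segments of equal size are covered.

```python
def depthwise_2d_params(channels: int, k: int) -> int:
    """Weights in a depthwise k x k convolution: one k x k kernel per channel."""
    return channels * k * k


def depthwise_1d_pair_params(channels: int, k: int) -> int:
    """Weights in a depthwise 1 x k convolution followed by a depthwise k x 1 one."""
    return channels * 2 * k


def pointwise_params(channels: int) -> int:
    """Weights in a full 1 x 1 convolution that mixes all channels with each other."""
    return channels * channels


def split_pointwise_params(channels: int, segments: int, shared: bool) -> int:
    """Weights when channels are cut into equal, non-overlapping segments
    and each segment is mixed only within itself (assumed simplification).

    With shared=True every segment reuses the same weight matrix.
    """
    if channels % segments != 0:
        raise ValueError("channels must be divisible by segments")
    seg = channels // segments
    per_segment = seg * seg
    return per_segment if shared else segments * per_segment


# Hypothetical configuration, chosen only to make the arithmetic concrete.
C, K, S = 256, 5, 2

spatial_2d = depthwise_2d_params(C, K)                    # 256 * 25
spatial_1d = depthwise_1d_pair_params(C, K)               # 256 * 10
channel_full = pointwise_params(C)                        # 256 * 256
channel_split = split_pointwise_params(C, S, shared=False)
channel_shared = split_pointwise_params(C, S, shared=True)

print(spatial_2d, spatial_1d)
print(channel_full, channel_split, channel_shared)
```

Under these assumptions the spatial-mixing cost per channel falls from k² to 2k weights, and the channel-mixing cost falls by a factor equal to the number of segments (or its square when the weights are shared). The figures reported in the paper depend on its specific variants and are not reproduced by this count.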
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Advanced Neural Network Applications · Advanced Vision and Imaging
Methods: Average Pooling · Residual Connection · Global Average Pooling · Dense Connections · Convolution · Dropout · Layer Normalization · MLP-Mixer · 3D Convolution
