CycleMLP: A MLP-like Architecture for Dense Prediction

Shoufa Chen; Enze Xie; Chongjian Ge; Runjian Chen; Ding Liang; Ping Luo

arXiv:2107.10224 · cs.CV · March 21, 2022 · 99 citations

TL;DR

CycleMLP introduces a versatile MLP-like architecture that handles various image sizes efficiently, surpasses existing models in dense prediction tasks, and maintains low computational complexity, making it suitable for object detection and segmentation.

Contribution

The paper proposes CycleMLP, an MLP-like architecture with linear complexity and adaptability to different image sizes, outperforming existing MLPs and Transformer models in dense visual prediction tasks.
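The adaptability claim can be illustrated with a minimal sketch in plain Python. It does not implement CycleMLP's own operator, which this summary does not describe. It only contrasts a fully spatial token-mixing layer, whose weight is fixed to the number of tokens N, with a per-token channel projection, whose weight depends only on the channel count. The function names `spatial_mix`, `channel_mix`, and `make_tokens` are hypothetical and chosen for illustration.

```python
def make_tokens(height, width, channels):
    """Build a dummy feature map flattened to a list of H*W tokens (illustrative helper)."""
    return [[1.0] * channels for _ in range(height * width)]


def spatial_mix(tokens, weight):
    """Mix across tokens with a fixed N x N weight, as in fully spatial MLP layers."""
    n = len(tokens)
    if len(weight) != n or any(len(row) != n for row in weight):
        raise ValueError(
            f"weight expects {len(weight)} tokens, but the input has {n}"
        )
    channels = len(tokens[0])
    return [
        [sum(weight[i][j] * tokens[j][c] for j in range(n)) for c in range(channels)]
        for i in range(n)
    ]


def channel_mix(tokens, weight):
    """Project every token independently with a C_in x C_out weight."""
    c_in, c_out = len(weight), len(weight[0])
    return [
        [sum(tok[k] * weight[k][o] for k in range(c_in)) for o in range(c_out)]
        for tok in tokens
    ]


if __name__ == "__main__":
    small = make_tokens(2, 2, 3)   # 4 tokens
    large = make_tokens(3, 3, 3)   # 9 tokens
    w_spatial = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    w_channel = [[0.5, 0.5] for _ in range(3)]

    spatial_mix(small, w_spatial)            # works: weight matches 4 tokens
    try:
        spatial_mix(large, w_spatial)        # fails: weight is tied to 4 tokens
    except ValueError as err:
        print("spatial mixing:", err)
    print(len(channel_mix(small, w_channel)), len(channel_mix(large, w_channel)))
```

The channel projection accepts both resolutions with the same weight, which is the property a dense-prediction backbone needs when input sizes vary.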

Findings

01

CycleMLP outperforms Swin-Tiny by 1.3% mIoU on ADE20K.

02

CycleMLP achieves competitive results in object detection and segmentation.

03

CycleMLP demonstrates strong zero-shot robustness on ImageNet-C.

Abstract

This paper presents a simple MLP-like architecture, CycleMLP, which is a versatile backbone for visual recognition and dense predictions. Modern MLP architectures, e.g., MLP-Mixer, ResMLP, and gMLP, are tied to a fixed image size and are therefore infeasible for object detection and segmentation. CycleMLP has two advantages over these approaches. (1) It can cope with various image sizes. (2) It achieves linear computational complexity with respect to image size by using local windows. In contrast, previous MLPs have $O(N^{2})$ computations due to fully spatial connections. We build a family of models which surpass existing MLPs and even state-of-the-art Transformer-based models, e.g., Swin Transformer, while using fewer parameters and FLOPs. We expand the MLP-like models' applicability, making them a versatile backbone for dense prediction tasks. CycleMLP achieves…
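The complexity contrast in the abstract can be made concrete with a rough cost model. The sketch below is not taken from the paper: it assumes, for illustration only, that a fully spatial layer costs N · N · C multiply-accumulates and that a local-window layer reads a fixed number of positions per token, costing N · window · C. The function names and the `window` parameter are hypothetical.

```python
def spatial_fc_macs(num_tokens, channels):
    """Assumed cost of fully spatial mixing: every token reads every token, per channel."""
    return num_tokens * num_tokens * channels


def local_window_macs(num_tokens, channels, window):
    """Assumed cost of local mixing: every token reads a fixed `window` of positions."""
    return num_tokens * window * channels


if __name__ == "__main__":
    channels, window = 64, 7
    for side in (56, 112, 224):
        n = side * side  # number of tokens N for a side x side feature map
        print(
            side,
            spatial_fc_macs(n, channels),
            local_window_macs(n, channels, window),
        )
```

Under these assumptions, doubling the side length of the feature map quadruples N, so the fully spatial cost grows 16 times while the local-window cost grows only 4 times.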

Code & Models

Videos

CycleMLP: A MLP-like Architecture for Dense Prediction · SlidesLive

Taxonomy

Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning

Methods: Attention Is All You Need · Linear Layer · Feedforward Network · Spatial Gating Unit · Affine Operator · gMLP · Residual Multi-Layer Perceptrons · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Label Smoothing