Activation-Free Backbones for Image Recognition: Polynomial Alternatives within MetaFormer-Style Vision Models

Jeffrey Wang; Jonathan Gregory; Grigorios G. Chrysos

arXiv:2605.20839·cs.CV·May 21, 2026

TL;DR

This paper introduces activation-free polynomial modules for MetaFormer-style vision backbones. The resulting PolyNeXt models match or exceed activation-based counterparts on ImageNet classification, ADE20K semantic segmentation, and out-of-distribution robustness.

Contribution

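The paper proposes activation-free polynomial alternatives to three core primitives of vision backbones (MLPs, convolutions, and attention), using Hadamard products in place of standard nonlinearities. The resulting models outperform prior polynomial networks at reduced computational cost.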

Findings

01

PolyNeXt models match or surpass activation-based models on ImageNet and ADE20K.

02

PolyNeXt models substantially outperform prior polynomial networks at reduced computational cost.

03

The models maintain high performance without pointwise activations (e.g., ReLU, GELU) or exponential softmax; Hadamard products supply the nonlinearity instead.

Abstract

Modern vision backbones treat pointwise activations (e.g., ReLU, GELU) and exponential softmax as essential sources of nonlinearity, but we demonstrate they are not required within MetaFormer-style vision backbones. We design activation-free polynomial alternatives for three core primitives (MLPs, convolutions, and attention), where Hadamard products replace standard nonlinearities to yield polynomial functions of the input. These modules integrate seamlessly into existing architectures: instantiated within MetaFormer, a modular framework for vision backbones, our PolyNeXt models match or exceed activation-based counterparts across model scales on ImageNet classification, ADE20K semantic segmentation, and out-of-distribution robustness. We also substantially outperform prior polynomial networks at reduced computational cost, showing that polynomial variants of standard modules beat…
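To illustrate how a Hadamard product can stand in for a pointwise activation, here is a minimal sketch of a degree-2 polynomial MLP block in plain Python. It is not the paper's PolyNeXt module: the function names (`linear`, `poly_mlp`, `random_matrix`), the two-branch layout, the absence of biases, normalization, and residual connections, and all sizes are assumptions made only for illustration. The block computes W_out((W_a x) * (W_b x)), where * is the elementwise product, so every output is a quadratic polynomial of the input.

```python
import math
import random


def linear(weights, x):
    """Matrix-vector product; `weights` is a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def poly_mlp(x, w_a, w_b, w_out):
    """Hypothetical activation-free block: W_out((W_a x) * (W_b x)).

    The elementwise (Hadamard) product of two linear projections takes the
    place of ReLU/GELU, making each output a degree-2 polynomial of x.
    """
    a = linear(w_a, x)
    b = linear(w_b, x)
    hidden = [u * v for u, v in zip(a, b)]  # Hadamard product
    return linear(w_out, hidden)


def random_matrix(rows, cols, rng):
    """Illustrative random weights; not a recommended initialization."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
dim, hidden_dim = 4, 8  # assumed sizes for the demo
w_a = random_matrix(hidden_dim, dim, rng)
w_b = random_matrix(hidden_dim, dim, rng)
w_out = random_matrix(dim, hidden_dim, rng)

x = [0.5, -1.0, 2.0, 0.25]
y = poly_mlp(x, w_a, w_b, w_out)

# Without biases the block is homogeneous of degree 2:
# scaling the input by 3 scales every output by 3**2 = 9.
y_scaled = poly_mlp([3.0 * v for v in x], w_a, w_b, w_out)
homogeneous = all(
    math.isclose(ys, 9.0 * yv, rel_tol=1e-9, abs_tol=1e-9)
    for ys, yv in zip(y_scaled, y)
)
print("degree-2 scaling holds:", homogeneous)
```

The scaling check is one simple way to see that the block is nonlinear without any activation function: a purely linear map would scale its output by 3, not 9. Stacking such blocks raises the polynomial degree with depth.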
