Vision KAN: Towards an Attention-Free Backbone for Vision with Kolmogorov-Arnold Networks

Zhuoqin Yang; Jiansong Zhang; Xiaoling Luo; Xu Wu; Zheng Lu; Linlin Shen

arXiv:2601.21541 · cs.CV · January 30, 2026

TL;DR

Vision KAN (ViK) is an attention-free vision backbone built on Kolmogorov-Arnold Networks. It reaches competitive accuracy with complexity linear in feature resolution, addressing the scalability and interpretability limits of attention mechanisms.

Contribution

The paper proposes a novel attention-free backbone, ViK, based on KANs, combining patch-wise nonlinear transforms, local propagation, and global mapping for efficient vision modeling.

Findings

1. Achieves competitive ImageNet-1K accuracy
2. Maintains linear complexity in feature resolution
3. Offers a theoretically grounded alternative to attention mechanisms

Abstract

Attention mechanisms have become a key module in modern vision backbones due to their ability to model long-range dependencies. However, their quadratic complexity in sequence length and the difficulty of interpreting attention weights limit both scalability and clarity. Recent attention-free architectures demonstrate that strong performance can be achieved without pairwise attention, motivating the search for alternatives. In this work, we introduce Vision KAN (ViK), an attention-free backbone inspired by Kolmogorov-Arnold Networks (KANs). At its core lies MultiPatch-RBFKAN, a unified token mixer that combines (a) a patch-wise nonlinear transform with Radial Basis Function-based KANs, (b) axis-wise separable mixing for efficient local propagation, and (c) low-rank global mapping for long-range interaction. Employed as a drop-in replacement for attention modules, this formulation tackles…
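To make two of the ingredients named in the abstract more concrete, here is a minimal NumPy sketch of an RBF-based KAN layer followed by a low-rank token mixing step. It is an illustration under stated assumptions, not the authors' MultiPatch-RBFKAN: the function names (`rbf_kan_layer`, `low_rank_global_mix`), the Gaussian basis with a shared `gamma`, the evenly spaced centers, and all shapes are hypothetical choices, and the axis-wise separable mixing component is omitted.

```python
import numpy as np

def rbf_kan_layer(x, centers, gamma, weights):
    """Apply a learned univariate function to every input coordinate, then sum.

    x:       (n_tokens, d_in) token features
    centers: (n_basis,) RBF centers, shared across coordinates (assumption)
    weights: (d_in, n_basis, d_out) learnable coefficients
    """
    # Gaussian RBF response of each coordinate to each center.
    phi = np.exp(-gamma * (x[..., None] - centers) ** 2)  # (n_tokens, d_in, n_basis)
    # Weighted sum over basis functions and input coordinates.
    return np.einsum("nib,ibo->no", phi, weights)

def low_rank_global_mix(tokens, down, up):
    """Mix information across all tokens through a rank-limited bottleneck.

    tokens: (n_tokens, d); down: (rank, n_tokens); up: (n_tokens, rank)
    Cost is O(n_tokens * rank * d), i.e. linear in n_tokens for a fixed rank.
    """
    summary = down @ tokens  # (rank, d) global summaries
    return up @ summary      # (n_tokens, d) broadcast back to every token

# Toy shapes and random parameters, chosen only for illustration.
rng = np.random.default_rng(0)
n_tokens, d_in, d_out, n_basis, rank = 16, 8, 8, 5, 4
x = rng.standard_normal((n_tokens, d_in))
centers = np.linspace(-2.0, 2.0, n_basis)
weights = rng.standard_normal((d_in, n_basis, d_out)) * 0.1
down = rng.standard_normal((rank, n_tokens)) / n_tokens
up = rng.standard_normal((n_tokens, rank))

y = rbf_kan_layer(x, centers, gamma=1.0, weights=weights)
z = low_rank_global_mix(y, down, up)
print(y.shape, z.shape)
```

Unlike pairwise attention, neither step forms an n_tokens × n_tokens interaction matrix, which is the intuition behind a cost that grows linearly with the number of tokens when the rank is held fixed.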

Taxonomy

Topics: Advanced Neural Network Applications · Visual Attention and Saliency Detection · Generative Adversarial Networks and Image Synthesis