Hyb-KAN ViT: Hybrid Kolmogorov-Arnold Networks Augmented Vision Transformer
Sainath Dey, Mitul Goswami, Jashika Sethi, Prasant Kumar Pattnaik

TL;DR
Hyb-KAN ViT introduces a modular hybrid architecture that combines wavelet spectral decomposition with spline activations to improve multi-scale feature extraction and efficiency in Vision Transformers, reporting state-of-the-art results on ImageNet-1K, COCO, and ADE20K.
Contribution
The paper presents a novel hybrid ViT framework integrating wavelet transforms and spline functions, enhancing spectral modeling and modularity over prior architectures.
Findings
State-of-the-art performance on ImageNet-1K, COCO, and ADE20K datasets.
Wavelet spectral priors improve segmentation accuracy.
Spline-based modules enhance detection efficiency.
Abstract
This study addresses the inherent limitations of Multi-Layer Perceptrons (MLPs) in Vision Transformers (ViTs) by introducing Hybrid Kolmogorov-Arnold Network (KAN)-ViT (Hyb-KAN ViT), a novel framework that integrates wavelet-based spectral decomposition and spline-optimized activation functions. Prior work has not focused on the prebuilt modularity of the ViT architecture or on integrating the edge-detection capabilities of wavelet functions. We propose two key modules: Efficient-KAN (Eff-KAN), which replaces MLP layers with spline functions, and Wavelet-KAN (Wav-KAN), which leverages orthogonal wavelet transforms for multi-resolution feature extraction. These modules are systematically integrated into ViT encoder layers and classification heads to enhance spatial-frequency modeling while mitigating computational bottlenecks. Experiments on ImageNet-1K (Image Recognition), COCO (Object…
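To make the idea of a wavelet-based KAN layer concrete, the following is a minimal sketch, not the authors' implementation. It assumes the common KAN formulation in which every input-output edge carries its own learnable one-dimensional function, here a scaled and shifted mother wavelet. The Mexican hat wavelet, the class name WaveletKANLayer, and the parameter names weight, scale, and shift are illustrative assumptions. The Mexican hat is a continuous wavelet chosen for brevity and is not the orthogonal wavelet transform the abstract refers to.

```python
import math
import random


def mexican_hat(t: float) -> float:
    """Mexican hat (Ricker) mother wavelet, without its normalisation constant."""
    return (1.0 - t * t) * math.exp(-0.5 * t * t)


class WaveletKANLayer:
    """Toy KAN-style layer (hypothetical): one wavelet activation per edge.

    Output j is sum_i weight[j][i] * psi((x[i] - shift[j][i]) / scale[j][i]).
    In a trained model, weight, scale, and shift would all be learned.
    """

    def __init__(self, in_dim: int, out_dim: int, seed: int = 0):
        rng = random.Random(seed)
        self.in_dim = in_dim
        self.out_dim = out_dim
        # Random edge weights; unit scale and zero shift as a neutral start.
        self.weight = [[rng.uniform(-1.0, 1.0) for _ in range(in_dim)]
                       for _ in range(out_dim)]
        self.scale = [[1.0] * in_dim for _ in range(out_dim)]
        self.shift = [[0.0] * in_dim for _ in range(out_dim)]

    def forward(self, x: list[float]) -> list[float]:
        if len(x) != self.in_dim:
            raise ValueError(f"expected {self.in_dim} inputs, got {len(x)}")
        out = []
        for j in range(self.out_dim):
            total = 0.0
            for i in range(self.in_dim):
                # Dilate and translate the input before applying the wavelet.
                t = (x[i] - self.shift[j][i]) / self.scale[j][i]
                total += self.weight[j][i] * mexican_hat(t)
            out.append(total)
        return out


layer = WaveletKANLayer(in_dim=3, out_dim=2)
features = layer.forward([0.0, 0.5, -1.0])
```

Unlike an MLP, which applies a fixed nonlinearity after a linear map, this layer places the nonlinearity on each edge, so the scale parameter controls the resolution at which each input feature is examined.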
Taxonomy
Topics: Advanced Neural Network Applications · Face Recognition and Perception · Advanced Memory and Neural Computing
