SpaRTAN: Spatial Reinforcement Token-based Aggregation Network for Visual Recognition
Quan Bi Pay, Vishnu Monn Baskaran, Junn Yong Loo, KokSheik Wong, Simon See

TL;DR
SpaRTAN introduces a lightweight neural network architecture that enhances spatial and channel-wise feature processing using multi-scale kernels and wave-based channel aggregation, achieving high accuracy with fewer parameters in visual recognition tasks.
Contribution
The paper presents SpaRTAN, a novel architecture that effectively captures multi-order spatial features and reduces channel redundancies, improving efficiency and performance over existing models.
Findings
Achieves 77.7% accuracy on ImageNet-1k with only 3.8M parameters.
Reaches 50.0% AP on COCO with 21.5M parameters, surpassing previously reported results.
Demonstrates high parameter efficiency and competitive performance in visual recognition.
Abstract
The resurgence of convolutional neural networks (CNNs) in visual recognition tasks, exemplified by ConvNeXt, has demonstrated their capability to rival transformer-based architectures through advanced training methodologies and ViT-inspired design principles. However, both CNNs and transformers exhibit a simplicity bias, favoring straightforward features over complex structural representations. Furthermore, modern CNNs often integrate MLP-like blocks akin to those in transformers, but these blocks suffer from significant information redundancies, necessitating high expansion ratios to sustain competitive performance. To address these limitations, we propose SpaRTAN, a lightweight architectural design that enhances spatial and channel-wise information processing. SpaRTAN employs kernels with varying receptive fields, controlled by kernel size and dilation factor, to capture…
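The abstract says the receptive field of each kernel is controlled by kernel size and dilation factor. A minimal sketch of that relationship follows, using the standard formula for the extent of a dilated kernel, k_eff = d * (k - 1) + 1. The function names, the channel count, and the (kernel size, dilation) pairs are hypothetical illustrations and are not taken from the paper.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Spatial extent covered along one axis by a dilated convolution kernel."""
    if kernel_size < 1 or dilation < 1:
        raise ValueError("kernel_size and dilation must be positive integers")
    return dilation * (kernel_size - 1) + 1


def depthwise_weight_count(channels: int, kernel_size: int) -> int:
    """Weights in a depthwise convolution (one k x k filter per channel, no bias)."""
    return channels * kernel_size * kernel_size


if __name__ == "__main__":
    # Hypothetical configurations chosen for illustration only.
    channels = 64
    configs = [(3, 1), (3, 2), (5, 2), (7, 3)]
    for k, d in configs:
        extent = effective_kernel_size(k, d)
        weights = depthwise_weight_count(channels, k)
        print(f"k={k} d={d} -> extent {extent}x{extent}, {weights} weights")
```

Dilation widens the area a kernel covers without adding weights: under these assumptions a 3x3 kernel with dilation 2 spans the same 5x5 extent as an undilated 5x5 kernel while using 9 weights per channel instead of 25.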
Taxonomy
Topics: Advanced Neural Network Applications · Brain Tumor Detection and Classification · Adversarial Robustness in Machine Learning
Methods: ConvNeXt
