AFIDAF: Alternating Fourier and Image Domain Adaptive Filters as an Efficient Alternative to Attention in ViTs
Yunling Zheng, Zeyi Xu, Fanghui Xue, Biao Yang, Jiancheng Lyu, Shuai Zhang, Yingyong Qi, Jack Xin

TL;DR
This paper introduces AFIDAF, an alternating Fourier and image domain filtering method that replaces attention mechanisms in vision transformers, achieving high performance with lower computational cost.
Contribution
The paper presents a novel alternating filtering approach as an efficient alternative to attention in vision transformers, improving performance and enabling model compression.
Findings
Reaches state-of-the-art-level accuracy among lightweight models on ImageNet-1K classification.
Improves downstream object detection and segmentation tasks.
Provides a new tool for compressing vision transformers.
Abstract
We propose and demonstrate an alternating Fourier and image domain filtering approach for feature extraction as an efficient alternative to build a vision backbone without using the computationally intensive attention. The performance among the lightweight models reaches the state-of-the-art level on ImageNet-1K classification, and improves downstream tasks on object detection and segmentation consistently as well. Our approach also serves as a new tool to compress vision transformers (ViTs).
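To make the idea of alternating between the two domains concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the authors' architecture: the function names (`fourier_filter`, `image_filter`, `alternating_block`), the per-channel complex frequency weight, and the depthwise k x k kernel are all hypothetical choices. The filters here are fixed arrays, so the input-adaptive behavior suggested by the method's name is not modeled.

```python
import numpy as np

def fourier_filter(x, weight):
    """Filter a real feature map x of shape (H, W, C) in the Fourier domain.

    weight: complex array of shape (H, W // 2 + 1, C), multiplied elementwise
    with the 2D real FFT of x (an assumed parameterization).
    """
    h, w, _ = x.shape
    spec = np.fft.rfft2(x, axes=(0, 1))           # to frequency domain
    return np.fft.irfft2(spec * weight, s=(h, w), axes=(0, 1))  # back to image domain

def image_filter(x, kernel):
    """Depthwise k x k filtering in the image domain with zero padding.

    kernel: real array of shape (k, k, C) with k odd; applied as a
    cross-correlation, as is conventional in deep learning layers.
    """
    k = kernel.shape[0]
    p = k // 2
    h, w, _ = x.shape
    xp = np.pad(x, ((p, p), (p, p), (0, 0)))
    out = np.zeros(x.shape, dtype=float)
    for i in range(k):
        for j in range(k):
            out += xp[i:i + h, j:j + w, :] * kernel[i, j, :]
    return out

def alternating_block(x, fourier_weight, kernel):
    """Apply one Fourier-domain filter followed by one image-domain filter."""
    return image_filter(fourier_filter(x, fourier_weight), kernel)

# Example: an 8 x 8 feature map with 4 channels and random filters.
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8, 4))
fw = rng.standard_normal((8, 5, 4)) + 1j * rng.standard_normal((8, 5, 4))
kern = rng.standard_normal((3, 3, 4))
y = alternating_block(x, fw, kern)
print(y.shape)
```

The elementwise product in the frequency domain mixes information across the whole feature map at FFT cost, while the small spatial kernel handles local structure. Stacking such blocks is one plausible way to build a backbone without attention.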
Taxonomy
Topics: Image and Signal Denoising Methods · Advanced Image Processing Techniques · Sparse and Compressive Sensing Techniques
