AFIDAF: Alternating Fourier and Image Domain Adaptive Filters as an Efficient Alternative to Attention in ViTs

Yunling Zheng; Zeyi Xu; Fanghui Xue; Biao Yang; Jiancheng Lyu; Shuai Zhang; Yingyong Qi; Jack Xin

arXiv:2407.12217·cs.CV·September 27, 2024

TL;DR

This paper introduces AFIDAF, which alternates Fourier-domain and image-domain filtering to replace attention in vision transformers, reaching state-of-the-art ImageNet-1K accuracy among lightweight models without the computational cost of attention.

Contribution

The paper presents alternating Fourier and image domain filtering as an efficient alternative to attention for building vision backbones. The approach improves classification and downstream task results and can also be used to compress ViTs.

Findings

01

Reaches state-of-the-art ImageNet-1K classification accuracy among lightweight models.

02

Improves downstream object detection and segmentation tasks.

03

Provides a new tool for compressing vision transformers.

Abstract

We propose and demonstrate an alternating Fourier and image domain filtering approach for feature extraction as an efficient alternative to build a vision backbone without using the computationally intensive attention. The performance among the lightweight models reaches the state-of-the-art level on ImageNet-1K classification, and improves downstream tasks on object detection and segmentation consistently as well. Our approach also serves as a new tool to compress vision transformers (ViTs).
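
To make the idea of alternating between the two domains concrete, here is a minimal NumPy sketch. It is not the authors' architecture: the function names (`fourier_filter`, `image_filter`, `alternating_block`), the fixed (non-adaptive) filter weights, the depthwise 3x3 kernel, and the absence of normalization, residual connections, and MLP layers are all assumptions made for illustration. It only shows one Fourier-domain filtering step (elementwise multiplication of the 2D spectrum) followed by one image-domain filtering step (a small spatial kernel applied per channel).

```python
import numpy as np

def fourier_filter(x, w):
    """Filter x (H, W, C) by elementwise multiplication in the 2D Fourier domain.

    w is a hypothetical filter of shape (H, W // 2 + 1, C), matching the
    output of rfft2 over the two spatial axes.
    """
    h, wd, _ = x.shape
    spec = np.fft.rfft2(x, axes=(0, 1))            # complex spectrum per channel
    return np.fft.irfft2(spec * w, s=(h, wd), axes=(0, 1))

def image_filter(x, k):
    """Depthwise spatial filtering (cross-correlation) with zero padding.

    x has shape (H, W, C); k has shape (kh, kw, C) with odd kh and kw.
    """
    h, wd, _ = x.shape
    kh, kw, _ = k.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(x, ((ph, ph), (pw, pw), (0, 0)))
    out = np.zeros(x.shape, dtype=float)
    for i in range(kh):
        for j in range(kw):
            # Shifted view of the input, weighted per channel by k[i, j].
            out += padded[i:i + h, j:j + wd, :] * k[i, j, :]
    return out

def alternating_block(x, w, k):
    """One Fourier-domain step followed by one image-domain step."""
    return image_filter(fourier_filter(x, w), k)

# Usage with random features and random (untrained) filters.
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8, 3))
w = rng.standard_normal((8, 5, 3))
k = rng.standard_normal((3, 3, 3))
y = alternating_block(x, w, k)   # same shape as x
```

The Fourier step mixes information across the whole feature map at FFT cost, while the spatial step acts locally; alternating the two is the general pattern the abstract describes, under the simplifying assumptions stated above.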

Taxonomy

Topics: Image and Signal Denoising Methods · Advanced Image Processing Techniques · Sparse and Compressive Sensing Techniques