Fibottention: Inceptive Visual Representation Learning with Diverse Attention Across Heads

Ali K. Rahimian; Manish K. Govind; Subhajit Maity; Dominick Reilly; Christian Kümmerle; Srijan Das; Aritra Dutta

arXiv:2406.19391 · cs.CV · February 16, 2026

TL;DR

Fibottention introduces a structured sparse self-attention mechanism based on the Wythoff array, reducing computational complexity and enhancing diversity in vision transformers, leading to improved efficiency and performance across multiple visual tasks.

Contribution

The paper proposes Fibottention, a novel sparse self-attention method with diverse head patterns, achieving lower complexity and better feature diversity in vision transformers.

Findings

01

Achieves $O(N \log N)$ complexity in self-attention.

02

Models with Fibottention outperform or match dense models with only 2% of pairwise interactions.

03

Consistently superior results compared to existing sparse attention methods.
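
A rough way to see where an $O(N \log N)$ count can come from, under an assumption made here only for illustration (this page does not state the paper's own argument): suppose each of the $N$ query tokens attends to itself and to keys whose offset is a Fibonacci number $F_k$, with $F_1 = F_2 = 1$. Since Fibonacci numbers grow geometrically, only logarithmically many of them fit below $N$:

$$
F_k \ge \varphi^{k-2} \;\Longrightarrow\; \#\{k \ge 1 : F_k \le N\} \le \log_\varphi N + 2, \qquad \varphi = \tfrac{1+\sqrt{5}}{2},
$$

$$
\text{attended pairs} \;\le\; N\bigl(2(\log_\varphi N + 2) + 1\bigr) = O(N \log N).
$$

The factor 2 accounts for offsets on both sides of the query and the $+1$ for the token itself; dense self-attention, by comparison, evaluates all $N^2$ pairs.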

Abstract

Vision Transformers and their variants have achieved remarkable success in diverse visual perception tasks. Despite their effectiveness, they suffer from two significant limitations: first, the quadratic computational complexity of multi-head self-attention (MHSA), which restricts scalability to large token counts, and second, a high dependency on large-scale training data to attain competitive performance. To address these challenges, we propose a novel sparse self-attention mechanism named Fibottention. Fibottention employs structured sparsity patterns derived from the Wythoff array, enabling $O(N \log N)$ computational complexity in self-attention. By design, its sparsity patterns vary across attention heads, which provably reduces redundant pairwise interactions while ensuring sufficient and diverse coverage. This leads to an inception-like…
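
To make the idea of Wythoff-derived, head-specific sparsity more concrete, here is a minimal Python sketch. It is not the authors' implementation: the helper names (`wythoff_row`, `head_mask`, `build_masks`, `density`), the rule "head h uses row h+1 of the Wythoff array as its set of allowed token offsets", and the choice to always keep the diagonal are all assumptions made for illustration. What it does rely on are two standard facts about the Wythoff array: every row follows the Fibonacci recurrence, and every positive integer appears in exactly one row, so different rows give disjoint offset sets.

```python
import math

PHI = (1 + math.sqrt(5)) / 2  # golden ratio


def wythoff_row(m, limit):
    """Entries of row m (1-indexed) of the Wythoff array that are < limit."""
    k = math.floor(m * PHI)
    a = math.floor(k * PHI)  # first entry of the row
    b = a + k                # second entry: floor(k*phi^2) = floor(k*phi) + k
    row = []
    while a < limit:
        row.append(a)
        a, b = b, a + b      # Fibonacci recurrence along the row
    return row


def head_mask(num_tokens, offsets):
    """Boolean mask: query i may attend key j iff i == j or |i - j| is in offsets.

    Keeping the diagonal (i == j) is an assumption of this sketch.
    """
    allowed = set(offsets) | {0}
    return [[abs(i - j) in allowed for j in range(num_tokens)]
            for i in range(num_tokens)]


def build_masks(num_tokens, num_heads):
    """One mask per head; head h uses Wythoff row h + 1 (hypothetical assignment)."""
    return [head_mask(num_tokens, wythoff_row(h + 1, num_tokens))
            for h in range(num_heads)]


def density(mask):
    """Fraction of query-key pairs that the mask keeps."""
    n = len(mask)
    return sum(map(sum, mask)) / (n * n)


masks = build_masks(num_tokens=64, num_heads=4)
for h, mask in enumerate(masks):
    print(f"head {h}: offsets={wythoff_row(h + 1, 64)} density={density(mask):.3f}")
```

Because the rows are disjoint, no off-diagonal query-key pair is kept by more than one head in this sketch, which is one simple way to obtain head-wise diversity without redundant interactions. In practice such a boolean mask would be passed to an attention routine that supports masking; how the paper selects, shifts, or bounds the offsets per head is not described on this page.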

Code & Models

Repositories

charlotte-charmlab/fibottention
PyTorch · Official

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Adversarial Robustness in Machine Learning

Methods: Attention Is All You Need · Cosine Annealing · Weight Decay · Softmax · Layer Normalization · Byte Pair Encoding · Label Smoothing · Attention Dropout · Position-Wise Feed-Forward Layer