iiANET: Inception Inspired Attention Hybrid Network for efficient Long-Range Dependency
Haruna Yunusa, Adamu Lawan, Abdulganiyu Abdu Yusuf

TL;DR
iiANET is a hybrid visual backbone that efficiently combines global self-attention and local convolutional features to better capture long-range dependencies in complex images.
Contribution
The paper introduces iiABlock, a building block that runs a modified global r-MHSA (Multi-Head Self-Attention) and convolutional layers in parallel to improve feature extraction.
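The parallel idea can be illustrated with a toy sketch. This is not the authors' iiABlock. It assumes a 1D token sequence instead of an image, uses plain single-head self-attention with identity projections as a stand-in for the modified r-MHSA (whose details are not given here), uses a fixed smoothing kernel as the convolution, and fuses the two branches by concatenation. The names `global_attention`, `local_conv`, and `parallel_block` are all hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def global_attention(tokens):
    """Single-head self-attention with identity Q/K/V projections (assumption).

    Each output token is a weighted average over ALL input tokens,
    which is what gives this branch its global receptive field.
    """
    dim = len(tokens[0])
    scale = 1.0 / math.sqrt(dim)
    out = []
    for q in tokens:
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[d] for w, v in zip(weights, tokens))
                    for d in range(dim)])
    return out


def local_conv(tokens, kernel=(0.25, 0.5, 0.25)):
    """Per-channel 1D convolution with zero padding.

    Each output token only sees its immediate neighbours (local detail).
    The kernel values are illustrative, not learned.
    """
    n, dim = len(tokens), len(tokens[0])
    half = len(kernel) // 2
    out = []
    for i in range(n):
        row = [0.0] * dim
        for j, w in enumerate(kernel):
            src = i + j - half
            if 0 <= src < n:
                for d in range(dim):
                    row[d] += w * tokens[src][d]
        out.append(row)
    return out


def parallel_block(tokens):
    """Feed the same input to both branches and concatenate per token.

    Concatenation is an assumed fusion rule; the feature size doubles.
    """
    global_feats = global_attention(tokens)
    local_feats = local_conv(tokens)
    return [g + l for g, l in zip(global_feats, local_feats)]


tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
fused = parallel_block(tokens)
print(len(fused), len(fused[0]))
```

Both branches read the same input, so neither has to wait for the other. In this sketch the first half of each fused vector carries sequence-wide context and the second half carries neighbourhood detail.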
Findings
Achieves state-of-the-art performance on visual recognition benchmarks.
Effectively captures both global context and local details.
Maintains computational efficiency while enhancing feature modeling.
Abstract
The recent emergence of hybrid models has introduced a transformative approach to computer vision, gradually moving beyond conventional convolutional neural networks and vision transformers. However, efficiently combining these two approaches to better capture long-range dependencies in complex images remains a challenge. In this paper, we present iiANET (Inception Inspired Attention Network), an efficient hybrid visual backbone designed to improve the modeling of long-range dependencies in complex visual recognition tasks. The core innovation of iiANET is the iiABlock, a unified building block that integrates a modified global r-MHSA (Multi-Head Self-Attention) and convolutional layers in parallel. This design enables iiABlock to simultaneously capture global context and local details, making it effective for extracting rich and diverse features. By efficiently fusing these…
