ReGLA: Efficient Receptive-Field Modeling with Gated Linear Attention Network
Junzhou Li, Manqi Zhao, Yilin Gao, Zhiheng Yu, Yin Li, Dongsheng Jiang, Li Xiao

TL;DR
ReGLA is a series of lightweight hybrid convolution-Transformer networks with efficient receptive-field modeling. It targets high accuracy at low latency on high-resolution images, which makes it suitable for real-time vision tasks.
Contribution
The paper presents ReGLA, a hybrid network that combines efficient convolutions with ReLU-based gated linear attention. Its main components are the Efficient Large Receptive Field (ELRF) module, the ReLU Gated Modulated Attention (RGMA) module, and a multi-teacher distillation strategy.
Findings
ReGLA-M achieves 80.85% Top-1 accuracy on ImageNet-1K.
ReGLA runs with a latency of 4.98 ms at 512 px resolution.
ReGLA outperforms similar models in object detection and segmentation tasks.
Abstract
Balancing accuracy and latency on high-resolution images is a critical challenge for lightweight models, particularly for Transformer-based architectures, which often suffer from excessive latency. To address this issue, we introduce ReGLA, a series of lightweight hybrid networks that integrate efficient convolutions for local feature extraction with ReLU-based gated linear attention for global modeling. The design incorporates three key innovations: the Efficient Large Receptive Field (ELRF) module, which improves convolutional efficiency while preserving a large receptive field; the ReLU Gated Modulated Attention (RGMA) module, which maintains linear complexity while enhancing local feature representation; and a multi-teacher distillation strategy, which boosts performance on downstream tasks. Extensive experiments validate the superiority of ReGLA; in particular, ReGLA-M achieves…
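The abstract's claim that ReLU-based linear attention keeps linear complexity can be illustrated with a minimal NumPy sketch. This is a generic ReLU-kernel linear attention with a simple sigmoid gate, not the paper's RGMA module. The function names, the gate form, and the `w_gate` parameter are assumptions made for illustration only.

```python
import numpy as np

def relu_linear_attention(q, k, v, eps=1e-6):
    """Generic ReLU-kernel linear attention (illustrative, not the paper's RGMA).

    q, k: (n, d) arrays; v: (n, d_v) array.
    Cost grows linearly with the number of tokens n.
    """
    q = np.maximum(q, 0.0)          # ReLU feature map replaces softmax
    k = np.maximum(k, 0.0)
    kv = k.T @ v                    # (d, d_v) summary, computed once
    k_sum = k.sum(axis=0)           # (d,) normalizer summary
    num = q @ kv                    # (n, d_v)
    den = q @ k_sum + eps           # (n,)
    return num / den[:, None]

def relu_quadratic_attention(q, k, v, eps=1e-6):
    """Same result computed through the explicit (n, n) attention matrix."""
    a = np.maximum(q, 0.0) @ np.maximum(k, 0.0).T   # (n, n)
    return (a @ v) / (a.sum(axis=1, keepdims=True) + eps)

def gated_output(x, attn_out, w_gate):
    """Hypothetical sigmoid gate computed from the input features x."""
    gate = 1.0 / (1.0 + np.exp(-(x @ w_gate)))
    return gate * attn_out
```

Because matrix multiplication is associative, both attention functions return the same values. The linear form never builds the n-by-n matrix, which is where the savings at high resolution come from: the token count n grows with the square of the image side length.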
Taxonomy
Topics: Advanced Neural Network Applications · Multimodal Machine Learning Applications · Generative Adversarial Networks and Image Synthesis
