BEVANet: Bilateral Efficient Visual Attention Network for Real-Time Semantic Segmentation

Ping-Mao Huang; I-Tien Chao; Ping-Chia Huang; Jia-Wei Liao; Yung-Yu Chuang

arXiv:2508.07300·cs.CV·August 21, 2025

TL;DR

BEVANet is a novel real-time semantic segmentation network that combines bilateral visual attention mechanisms, large kernel modules, and boundary-aware fusion to achieve high accuracy and efficiency without pretraining.

Contribution

It introduces the Large Kernel Attention mechanism and a bilateral architecture with adaptive feature fusion, advancing real-time segmentation performance.

Findings

01 Achieves 79.3% mIoU without pretraining

02 Runs in real time at 33 FPS

03 Outperforms existing methods on the Cityscapes dataset
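
The headline metric in the findings is mIoU (mean intersection over union). As a reminder of how that number is usually computed, here is a minimal sketch in plain Python. It is not the paper's evaluation code: the function names are hypothetical, and the ignore label 255 is an assumption borrowed from the common Cityscapes convention.

```python
from typing import List, Sequence


def confusion_matrix(pred: Sequence[int], target: Sequence[int],
                     num_classes: int, ignore_index: int = 255) -> List[List[int]]:
    """Accumulate a num_classes x num_classes matrix; rows are ground truth, columns are predictions."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        if t == ignore_index:  # assumed "void" label, skipped during evaluation
            continue
        cm[t][p] += 1
    return cm


def mean_iou(cm: List[List[int]]) -> float:
    """Average per-class IoU = TP / (TP + FP + FN) over classes that appear at least once."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                        # ground truth c, predicted as something else
        fp = sum(cm[r][c] for r in range(n)) - tp   # predicted c, ground truth is something else
        denom = tp + fp + fn
        if denom == 0:
            continue  # class absent from both prediction and ground truth
        ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with flattened per-pixel labels (illustrative values only).
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 255]
cm = confusion_matrix(pred, target, num_classes=3)
score = mean_iou(cm)
```

In the toy example the per-class IoUs are 1/2, 2/3, and 1, so `score` is 13/18, roughly 0.722. A real evaluation would accumulate the confusion matrix over the whole validation set before averaging.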

Abstract

Real-time semantic segmentation presents the dual challenge of designing efficient architectures that capture large receptive fields for semantic understanding while also refining detailed contours. Vision transformers model long-range dependencies effectively but incur high computational cost. To address these challenges, we introduce the Large Kernel Attention (LKA) mechanism. Our proposed Bilateral Efficient Visual Attention Network (BEVANet) expands the receptive field to capture contextual information and extracts visual and structural features using Sparse Decomposed Large Separable Kernel Attentions (SDLSKA). The Comprehensive Kernel Selection (CKS) mechanism dynamically adapts the receptive field to further enhance performance. Furthermore, the Deep Large Kernel Pyramid Pooling Module (DLKPPM) enriches contextual features by synergistically combining dilated convolutions and…
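
The abstract motivates large kernels and dilated convolutions as a way to widen the receptive field cheaply. The sketch below illustrates that general trade-off only. It is not BEVANet's SDLSKA, CKS, or DLKPPM design: the kernel sizes and dilation are illustrative assumptions, and the helper names are hypothetical. It computes the receptive field of stacked stride-1 convolutions and compares per-channel weight counts for a single large depthwise kernel, a small-plus-dilated depthwise pair, and the same pair split into 1D kernels.

```python
from typing import Iterable, Tuple


def receptive_field(layers: Iterable[Tuple[int, int]]) -> int:
    """Receptive field (one spatial axis) of stacked stride-1 convolutions.

    Each layer is (kernel_size, dilation); a layer widens the field by (k - 1) * d.
    """
    rf = 1
    for kernel_size, dilation in layers:
        rf += (kernel_size - 1) * dilation
    return rf


def depthwise_weights_2d(kernel_size: int) -> int:
    """Weights per channel of a k x k depthwise convolution (bias ignored)."""
    return kernel_size * kernel_size


def depthwise_weights_1d_pair(kernel_size: int) -> int:
    """Weights per channel of a 1 x k followed by a k x 1 depthwise convolution (bias ignored)."""
    return 2 * kernel_size


# Assumed, illustrative decomposition: a 5x5 depthwise conv followed by a
# 7x7 depthwise conv with dilation 3.
decomposed = [(5, 1), (7, 3)]
rf = receptive_field(decomposed)

dense_cost = depthwise_weights_2d(rf)                                    # one large k x k kernel
decomposed_cost = sum(depthwise_weights_2d(k) for k, _ in decomposed)    # two 2D kernels
separable_cost = sum(depthwise_weights_1d_pair(k) for k, _ in decomposed)  # four 1D kernels
```

Under these assumed sizes the stack covers a 23-pixel-wide field with 74 weights per channel, against 529 for one dense 23x23 depthwise kernel, and 24 when each 2D kernel is split into a 1D pair. The counts ignore the pointwise (1x1) convolution that typically follows.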

Code & Models

Models

maomao0819/BEVANet (Hugging Face model, 6 downloads)
