HAFormer: Unleashing the Power of Hierarchy-Aware Features for Lightweight Semantic Segmentation
Guoan Xu, Wenjing Jia, Tao Wu, Ligeng Chen, and Guangwei Gao

TL;DR
HAFormer is a lightweight semantic segmentation model that effectively combines hierarchical CNN features with Transformer-based global context modeling, achieving high accuracy and speed with minimal computational cost.
Contribution
The paper introduces HAFormer, integrating a Hierarchy-Aware Pixel-Excitation module, an Efficient Transformer, and a correlation-weighted Fusion module for improved lightweight segmentation.
Findings
Achieves 74.2% mIoU on Cityscapes at a high frame rate
Outperforms existing lightweight models in accuracy and efficiency
Maintains low computational overhead with compact model size
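For readers unfamiliar with the metric quoted above, mean intersection-over-union (mIoU) averages the per-class overlap between predicted and ground-truth label maps. The following is a minimal sketch of that definition on flat label lists; the function name `mean_iou` and its arguments are illustrative and not taken from the paper. Benchmark evaluations such as Cityscapes typically accumulate counts over the whole dataset and ignore void labels, which this simplified version does not do.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over the classes that appear in pred or target.

    pred and target are flat sequences of integer class labels
    (hypothetical inputs for illustration) of equal length.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    ious = []
    for c in range(num_classes):
        # Pixels labeled c in both maps.
        intersection = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels labeled c in at least one map.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both maps
            ious.append(intersection / union)
    return sum(ious) / len(ious) if ious else 0.0
```

With `pred = [0, 0, 1, 1]` and `target = [0, 1, 1, 1]`, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12.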
Abstract
Both Convolutional Neural Networks (CNNs) and Transformers have shown great success in semantic segmentation tasks. Efforts have been made to integrate CNNs with Transformer models to capture both local and global context interactions. However, there is still room for improvement, particularly under constraints on computational resources. In this paper, we introduce HAFormer, a model that combines the hierarchical feature extraction ability of CNNs with the global dependency modeling capability of Transformers to tackle lightweight semantic segmentation challenges. Specifically, we design a Hierarchy-Aware Pixel-Excitation (HAPE) module for adaptive multi-scale local feature extraction. For global perception modeling, we devise an Efficient Transformer (ET) module that streamlines the quadratic calculations associated with traditional Transformers. Moreover, a…
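The "quadratic calculations" mentioned in the abstract refer to how standard self-attention scales with the number of tokens. The sketch below counts that cost for a feature map treated as one token per spatial position. It does not reproduce the ET module, whose details are not given in this summary; the function name and the single-head, no-projection cost model are assumptions made for illustration.

```python
def self_attention_cost(height, width, channels):
    """Rough cost of standard single-head self-attention on an H x W feature map.

    Assumes one token per spatial position and counts only the two
    large matrix products (scores and weighted sum), ignoring projections.
    """
    tokens = height * width
    # The attention score matrix has one entry per token pair.
    score_entries = tokens * tokens
    # Q @ K^T and scores @ V each take tokens * tokens * channels multiply-adds.
    multiply_adds = 2 * tokens * tokens * channels
    return tokens, score_entries, multiply_adds
```

Doubling both spatial dimensions quadruples the token count and multiplies the score-matrix size by 16, which is why lightweight designs try to shrink either the token count or the channel width before attention is applied.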
Taxonomy
Methods: Attention Is All You Need · Linear Layer · Multi-Head Attention · Softmax · Residual Connection · Byte Pair Encoding · Layer Normalization · Label Smoothing · Adam · Dropout
