PP-LiteSeg: A Superior Real-Time Semantic Segmentation Model
Juncai Peng, Yi Liu, Shiyu Tang, Yuying Hao, Lutao Chu, Guowei Chen, Zewu Wu, Zeyu Chen, Zhiliang Yu, Yuning Du, Qingqing Dang, Baohua Lai, Qiwen Liu, Xiaoguang Hu, Dianhai Yu, Yanjun Ma

TL;DR
PP-LiteSeg is a lightweight real-time semantic segmentation model that balances accuracy against inference speed using three modules: a Flexible and Lightweight Decoder (FLD), a Unified Attention Fusion Module (UAFM), and a Simple Pyramid Pooling Module (SPPM).
Contribution
The paper introduces PP-LiteSeg, featuring a novel lightweight decoder, a unified attention fusion module, and a simple pyramid pooling module for efficient global context aggregation.
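To make the idea of pyramid pooling for global context concrete, here is a minimal pure-Python sketch. It is not the paper's SPPM: the bin sizes (1, 2, 4), nearest-neighbour upsampling, the single-channel input, and the function names are all assumptions chosen for illustration, and the convolutions a real module would apply are omitted.

```python
def adaptive_avg_pool(fmap, bins):
    """Average-pool a 2D map (list of rows) down to bins x bins cells."""
    h, w = len(fmap), len(fmap[0])
    pooled = []
    for i in range(bins):
        r0 = (i * h) // bins                    # floor of the cell start
        r1 = ((i + 1) * h + bins - 1) // bins   # ceil of the cell end
        row = []
        for j in range(bins):
            c0 = (j * w) // bins
            c1 = ((j + 1) * w + bins - 1) // bins
            cells = [fmap[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            row.append(sum(cells) / len(cells))
        pooled.append(row)
    return pooled


def upsample_nearest(pooled, h, w):
    """Nearest-neighbour upsampling of a bins x bins map back to h x w."""
    bins = len(pooled)
    return [[pooled[r * bins // h][c * bins // w] for c in range(w)]
            for r in range(h)]


def pyramid_context(fmap, bin_sizes=(1, 2, 4)):
    """Sum the pooled-and-upsampled maps from every pyramid level.

    bin_sizes is an illustrative assumption, not a value taken from
    the text above.
    """
    h, w = len(fmap), len(fmap[0])
    out = [[0.0] * w for _ in range(h)]
    for bins in bin_sizes:
        up = upsample_nearest(adaptive_avg_pool(fmap, bins), h, w)
        for r in range(h):
            for c in range(w):
                out[r][c] += up[r][c]
    return out


if __name__ == "__main__":
    fmap = [[float(4 * r + c) for c in range(4)] for r in range(4)]
    for row in pyramid_context(fmap):
        print(row)
```

Each output position combines the global mean (1 x 1 level) with progressively more local averages, which is how a pyramid of cheap pooling operations can inject image-wide context into every pixel.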
Findings
Achieves 72.0% mIoU at 273.6 FPS on Cityscapes
Outperforms previous real-time segmentation methods in accuracy-speed trade-off
Demonstrates effectiveness of proposed modules in reducing computation
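The two quantities reported above, mIoU and FPS, can be illustrated with a short sketch. It assumes flattened per-pixel label lists and averages IoU only over classes that appear in the prediction or the ground truth; benchmark evaluation scripts may accumulate statistics differently, and the function names here are hypothetical.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union over flattened per-pixel labels."""
    inter = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if p == t:
            inter[p] += 1
            union[p] += 1
        else:
            # A mismatched pixel enlarges the union of both classes involved.
            union[p] += 1
            union[t] += 1
    # Skip classes absent from both prediction and ground truth.
    ious = [i / u for i, u in zip(inter, union) if u > 0]
    return sum(ious) / len(ious)


def latency_ms(fps):
    """Per-frame latency in milliseconds for a given throughput."""
    return 1000.0 / fps


if __name__ == "__main__":
    pred = [0, 0, 1, 1, 2, 2]
    target = [0, 1, 1, 1, 2, 0]
    print(mean_iou(pred, target, num_classes=3))
    print(latency_ms(273.6))
```

By this arithmetic, 273.6 FPS corresponds to roughly 3.65 ms per frame.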
Abstract
Real-world applications have high demands for semantic segmentation methods. Although semantic segmentation has made remarkable leaps forward with deep learning, the performance of real-time methods is not satisfactory. In this work, we propose PP-LiteSeg, a novel lightweight model for the real-time semantic segmentation task. Specifically, we present a Flexible and Lightweight Decoder (FLD) to reduce the computation overhead of previous decoders. To strengthen feature representations, we propose a Unified Attention Fusion Module (UAFM), which takes advantage of spatial and channel attention to produce a weight and then fuses the input features with the weight. Moreover, a Simple Pyramid Pooling Module (SPPM) is proposed to aggregate global context with low computation cost. Extensive evaluations demonstrate that PP-LiteSeg achieves a superior trade-off between accuracy and speed compared to…
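The attention-weighted fusion described in the abstract can be sketched as follows. This is an illustration under stated assumptions rather than the paper's UAFM: it covers only a spatial-attention variant, assumes both feature maps already share the same shape, replaces the learned layers with hypothetical fixed coefficients (coeffs, bias), and assumes the two inputs are blended with complementary weights alpha and 1 - alpha.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_stats(feat, r, c):
    """Mean and max across channels at one spatial position."""
    vals = [ch[r][c] for ch in feat]
    return sum(vals) / len(vals), max(vals)


def attention_fuse(high, low, coeffs=(0.5, 0.5, -0.5, -0.5), bias=0.0):
    """Fuse two C x H x W feature maps with a per-pixel attention weight.

    coeffs and bias stand in for learned parameters; they are
    hypothetical values picked for illustration.
    """
    channels, h, w = len(high), len(high[0]), len(high[0][0])
    out = [[[0.0] * w for _ in range(h)] for _ in range(channels)]
    for r in range(h):
        for c in range(w):
            # Four statistics: (mean, max) of each input at this pixel.
            stats = channel_stats(high, r, c) + channel_stats(low, r, c)
            alpha = sigmoid(sum(k * s for k, s in zip(coeffs, stats)) + bias)
            for ch in range(channels):
                out[ch][r][c] = (alpha * high[ch][r][c]
                                 + (1.0 - alpha) * low[ch][r][c])
    return out


if __name__ == "__main__":
    high = [[[2.0, 1.0]], [[0.0, 3.0]]]   # 2 channels, 1 x 2 spatial
    low = [[[4.0, 1.0]], [[2.0, 5.0]]]
    print(attention_fuse(high, low))
```

Because the weight lies strictly between 0 and 1, each fused value falls between the corresponding values of the two inputs; the attention weight only decides which input dominates at each position.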
Taxonomy
Topics: Advanced Neural Network Applications · Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Convolution · Batch Normalization · Average Pooling · Pyramid Pooling Module
