Graph-guided Architecture Search for Real-time Semantic Segmentation
Peiwen Lin, Peng Sun, Guangliang Cheng, Sirui Xie, Xi Li, Jianping Shi

TL;DR
This paper introduces Graph-guided Architecture Search (GAS), a method that automatically designs real-time semantic segmentation networks. It balances accuracy and speed through a new search space and a graph convolution network (GCN) used for communication during the search.
Contribution
The paper proposes a new search mechanism that combines cell-level diversity with a latency-oriented constraint, and integrates a GCN into the search to improve exploration of segmentation networks.
Findings
Achieves state-of-the-art accuracy-speed trade-off on Cityscapes.
GAS reaches 73.5% mIoU at 108.4 FPS on a Titan Xp GPU.
Outperforms previous methods in real-time segmentation benchmarks.
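To make the two reported metrics concrete, here is a minimal Python sketch of how they are commonly computed. It is not taken from the paper: the function names and the toy confusion matrix are illustrative assumptions. Under the usual definition, 108.4 FPS corresponds to roughly 9.2 ms per frame, and mIoU is the per-class intersection-over-union averaged over the classes that occur.

```python
def fps_to_latency_ms(fps):
    """Convert throughput in frames per second to per-frame latency in milliseconds."""
    return 1000.0 / fps


def mean_iou(confusion):
    """Mean intersection-over-union from a square confusion matrix.

    confusion[i][j] counts pixels whose true class is i and predicted class is j.
    Classes that never occur (no true or predicted pixels) are skipped.
    """
    n = len(confusion)
    ious = []
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                        # true c, predicted other
        fp = sum(confusion[r][c] for r in range(n)) - tp   # predicted c, true other
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    # Reported speed from the findings above, converted to per-frame latency.
    print(fps_to_latency_ms(108.4))
    # Toy two-class confusion matrix (hypothetical numbers, not paper data).
    print(mean_iou([[3, 1], [1, 5]]))
```

Real benchmarks such as Cityscapes accumulate the confusion matrix over the whole evaluation set before taking the ratio, rather than averaging per-image scores.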
Abstract
Designing a lightweight semantic segmentation network often requires researchers to find a trade-off between performance and speed, which is always empirical due to the limited interpretability of neural networks. To release researchers from these tedious mechanical trials, we propose a Graph-guided Architecture Search (GAS) pipeline to automatically search for real-time semantic segmentation networks. Unlike previous works that use a simplified search space and stack a repeatable cell to form a network, we introduce a novel search mechanism with a new search space, where a lightweight model can be effectively explored through cell-level diversity and a latency-oriented constraint. Specifically, to produce cell-level diversity, the cell-sharing constraint is eliminated by searching cells in a cell-independent manner. Then a graph convolution network (GCN) is seamlessly integrated as a…
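The two ideas named in the abstract, cell-independent search and a latency-oriented constraint, can be illustrated with a small Python sketch. This is a generic differentiable-search formulation, not the paper's implementation: the candidate operations, their latencies in milliseconds, the hinge-style penalty, and every function name below are hypothetical. Each cell keeps its own architecture logits instead of sharing one set across the network, and the expected latency is the softmax-weighted sum of per-operation latencies.

```python
import math

# Hypothetical candidate operations with made-up latencies in milliseconds.
CANDIDATE_OPS = {"skip": 0.1, "conv3x3": 1.0, "sep_conv3x3": 0.6}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_latency(cell_logits):
    """Expected network latency under the current architecture distribution.

    cell_logits[c][e] holds the logits over CANDIDATE_OPS for edge e of cell c.
    Every cell has its own logits, i.e. no cell-sharing constraint.
    """
    latencies = list(CANDIDATE_OPS.values())
    total = 0.0
    for cell in cell_logits:
        for edge in cell:
            probs = softmax(edge)
            total += sum(p * l for p, l in zip(probs, latencies))
    return total


def latency_penalised_loss(task_loss, cell_logits, target_ms, weight=0.1):
    """Task loss plus a hinge penalty once expected latency exceeds the target."""
    overshoot = max(0.0, expected_latency(cell_logits) - target_ms)
    return task_loss + weight * overshoot


if __name__ == "__main__":
    # Two independent cells with two edges each, all starting from uniform logits.
    logits = [[[0.0, 0.0, 0.0] for _ in range(2)] for _ in range(2)]
    print(expected_latency(logits))
    print(latency_penalised_loss(1.0, logits, target_ms=2.0))
```

In a shared-cell search, a single `cell_logits[0]` would be reused for every cell; giving each cell its own entry is what allows different parts of the network to settle on different operations.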
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Multimodal Machine Learning Applications
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Interpretability · Convolution
