BiSeNet: Bilateral Segmentation Network for Real-time Semantic Segmentation

Changqian Yu; Jingbo Wang; Chao Peng; Changxin Gao; Gang Yu; Nong Sang

arXiv:1808.00897 · cs.CV · August 3, 2018 · 126 cites

TL;DR

BiSeNet is a real-time semantic segmentation network that pairs high-resolution spatial features with a large receptive field, reaching 68.4% Mean IOU at 105 FPS on Cityscapes.

Contribution

Introduces a novel Bilateral Segmentation Network combining a Spatial Path and a Context Path with a Feature Fusion Module for efficient real-time segmentation.

Findings

01

Achieves 68.4% Mean IOU on the Cityscapes test set at 105 FPS

02

Balances speed and accuracy better than existing methods

03

Effective on Cityscapes, CamVid, and COCO-Stuff datasets
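
For readers unfamiliar with the metric quoted above, here is a minimal sketch of how Mean IOU (mean intersection over union) can be computed. The function name `mean_iou` and the flat label lists are illustrative assumptions, not taken from the paper; benchmark tooling such as the Cityscapes evaluation accumulates counts over a whole dataset rather than a single sample.

```python
def mean_iou(pred, target, num_classes):
    """Average the per-class IoU over classes that appear in pred or target.

    pred and target are flat sequences of integer class labels, one per pixel.
    This is a simplified, single-sample illustration (hypothetical helper).
    """
    ious = []
    for c in range(num_classes):
        # Pixels where both prediction and ground truth are class c.
        intersection = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels where either prediction or ground truth is class c.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both pred and target
            ious.append(intersection / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with 6 pixels and 3 classes.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
print(f"Mean IoU: {score:.3f}")
```

In the toy example the per-class IoUs are 1/3, 2/3, and 1/2, so the mean is 0.5.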

Abstract

Semantic segmentation requires both rich spatial information and a sizeable receptive field. However, modern approaches usually compromise spatial resolution to achieve real-time inference speed, which leads to poor performance. In this paper, we address this dilemma with a novel Bilateral Segmentation Network (BiSeNet). We first design a Spatial Path with a small stride to preserve the spatial information and generate high-resolution features. Meanwhile, a Context Path with a fast downsampling strategy is employed to obtain a sufficient receptive field. On top of the two paths, we introduce a new Feature Fusion Module to combine features efficiently. The proposed architecture strikes a good balance between speed and segmentation performance on the Cityscapes, CamVid, and COCO-Stuff datasets. Specifically, for a 2048x1024 input, we achieve 68.4% Mean IOU on the Cityscapes test dataset with…
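
The trade-off between the two paths can be illustrated with a little resolution arithmetic. The sketch below is not the authors' implementation: the stride lists (three stride-2 layers for the Spatial Path, five for the Context Path), the 3x3 kernel, and the helper names are assumptions chosen only to show how a small total stride keeps a high-resolution feature map while fast downsampling shrinks it.

```python
def conv_out(size, kernel, stride, padding):
    """Output length of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def path_resolution(height, width, strides, kernel=3, padding=1):
    """Feature-map size after a chain of convolutions with the given strides.

    The kernel size and padding are illustrative assumptions.
    """
    for s in strides:
        height = conv_out(height, kernel, s, padding)
        width = conv_out(width, kernel, s, padding)
    return height, width


# The abstract's 2048x1024 input, written as (height, width).
H, W = 1024, 2048

# Assumed: Spatial Path downsamples by 8 overall (small stride).
spatial = path_resolution(H, W, strides=[2, 2, 2])
# Assumed: Context Path downsamples by 32 overall (fast downsampling).
context = path_resolution(H, W, strides=[2, 2, 2, 2, 2])

# A fusion step must bring both maps to a common size before combining them.
upsample_factor = spatial[0] // context[0]

print("Spatial Path feature map:", spatial)
print("Context Path feature map:", context)
print("Upsampling needed before fusion:", upsample_factor)
```

Under these assumed strides the Spatial Path keeps a 128x256 map while the Context Path ends at 32x64, so the context features would need 4x upsampling before the two are combined.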

Taxonomy

Topics: Advanced Neural Network Applications · Video Surveillance and Tracking Methods · Advanced Image and Video Retrieval Techniques

Methods: Average Pooling · Residual Connection · 1x1 Convolution · Batch Normalization · Bottleneck Residual Block · Global Average Pooling · Residual Block · Kaiming Initialization