PIDNet: A Real-time Semantic Segmentation Network Inspired by PID Controllers
Jiacong Xu, Zixiang Xiong, Shankar P. Bhattacharyya

TL;DR
PIDNet introduces a three-branch architecture inspired by PID controllers to improve real-time semantic segmentation, effectively balancing accuracy and speed by better integrating detailed, contextual, and boundary information.
Contribution
The paper proposes PIDNet, a novel three-branch network that mitigates the overshoot issue of two-branch models by adding a boundary branch whose attention guides the fusion of detail and context features, yielding a better accuracy-speed trade-off.
Findings
PIDNet is more accurate than existing models of similar inference speed on the Cityscapes and CamVid datasets.
The fastest variant, PIDNet-S, runs in real time at over 93 FPS on Cityscapes.
The boundary attention mechanism effectively guides feature fusion.
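One way to picture boundary-guided fusion is as a per-position gate: where the boundary response is strong, the fused feature leans on the detail branch, and elsewhere it leans on the context branch. The sketch below is a minimal illustration under that assumption. The sigmoid gate, the flat lists standing in for feature maps, and the name `boundary_guided_fusion` are illustrative choices, not something this summary specifies.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real-valued logit to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def boundary_guided_fusion(detail, context, boundary):
    """Blend detail and context features using a boundary-derived gate.

    All three inputs are equal-length lists of floats, standing in for
    feature maps flattened to one value per position (an assumption made
    to keep the example dependency-free).
    """
    fused = []
    for d, c, b in zip(detail, context, boundary):
        gate = sigmoid(b)  # close to 1 near boundaries, close to 0 elsewhere
        fused.append(gate * d + (1.0 - gate) * c)
    return fused


# Hypothetical values: detail features are all 1.0, context features all 0.0,
# so the fused output directly shows how much weight the detail branch gets.
fused = boundary_guided_fusion(
    detail=[1.0, 1.0, 1.0],
    context=[0.0, 0.0, 0.0],
    boundary=[10.0, 0.0, -10.0],
)
print(fused)
```

With a strongly positive boundary logit the output follows the detail feature, with a strongly negative one it follows the context feature, and a zero logit gives an even blend.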
Abstract
Two-branch network architecture has shown its efficiency and effectiveness in real-time semantic segmentation tasks. However, direct fusion of high-resolution details and low-frequency context has the drawback of detailed features being easily overwhelmed by surrounding contextual information. This overshoot phenomenon limits the improvement of the segmentation accuracy of existing two-branch models. In this paper, we make a connection between Convolutional Neural Networks (CNN) and Proportional-Integral-Derivative (PID) controllers and reveal that a two-branch network is equivalent to a Proportional-Integral (PI) controller, which inherently suffers from similar overshoot issues. To alleviate this problem, we propose a novel three-branch network architecture: PIDNet, which contains three branches to parse detailed, context and boundary information, respectively, and employs boundary…
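The control-theory side of the analogy can be illustrated with a toy simulation: a PI controller driving a simple plant tends to overshoot its setpoint, and adding a derivative term damps that overshoot. The sketch below is only an illustration of that classical behaviour, not the paper's network. The plant (x'' = u - x'), the gains, the time step, and the name `step_overshoot` are all arbitrary assumptions.

```python
def step_overshoot(kp, ki, kd, dt=0.001, t_end=20.0):
    """Peak overshoot of a unit-step response for the toy plant x'' = u - x'.

    The plant and the default step size are assumptions chosen for
    illustration; the integration is a simple Euler scheme.
    """
    x, v = 0.0, 0.0              # plant position and velocity
    setpoint = 1.0
    integral = 0.0
    prev_error = setpoint - x    # avoids a derivative kick on the first step
    peak = x
    for _ in range(int(t_end / dt)):
        error = setpoint - x
        integral += error * dt
        derivative = (error - prev_error) / dt
        prev_error = error
        u = kp * error + ki * integral + kd * derivative  # controller output
        v += (u - v) * dt        # plant dynamics: acceleration = u - v
        x += v * dt
        peak = max(peak, x)
    return max(0.0, peak - setpoint)


# Same P and I gains in both runs; only the derivative gain differs.
pi_overshoot = step_overshoot(kp=10.0, ki=5.0, kd=0.0)
pid_overshoot = step_overshoot(kp=10.0, ki=5.0, kd=5.0)
print(f"PI overshoot:  {pi_overshoot:.3f}")
print(f"PID overshoot: {pid_overshoot:.3f}")
```

In this setup the PI run rings well past the setpoint while the PID run settles with a much smaller peak, which is the behaviour the abstract maps onto two-branch versus three-branch networks.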
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Advanced Neural Network Applications · Brain Tumor Detection and Classification
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
