Combining the Best of Convolutional Layers and Recurrent Layers: A Hybrid Network for Semantic Segmentation
Zhicheng Yan, Hao Zhang, Yangqing Jia, Thomas Breuel, Yizhou Yu

TL;DR
This paper introduces a hybrid neural network combining convolutional and recurrent layers to enhance semantic segmentation by capturing global context, resulting in improved accuracy on benchmark datasets.
Contribution
The paper proposes a novel hybrid deep ReNet architecture that integrates ReNet layers with FCNs, enabling full-image receptive fields and end-to-end training for better segmentation performance.
Findings
H-ReNet outperforms state-of-the-art methods on PASCAL VOC 2012.
ReNet layers effectively capture global context for segmentation.
H-ReNet achieves higher IoUs on multiple object classes.
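For readers unfamiliar with the metric behind the IoU finding, here is a minimal sketch of per-class intersection-over-union for a single predicted label map. It is an illustration only, not the benchmark's official evaluation code: the function names `per_class_iou` and `mean_iou` are hypothetical, the label maps are toy nested lists, and the sketch ignores details such as accumulating counts over a whole dataset or skipping unlabeled pixels.

```python
def per_class_iou(pred, target, num_classes):
    """Return IoU per class for two equally sized 2D label maps (nested lists)."""
    inter = [0] * num_classes  # pixels where prediction and ground truth agree on class c
    union = [0] * num_classes  # pixels where either map contains class c
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p == t:
                inter[p] += 1
                union[p] += 1
            else:
                union[p] += 1
                union[t] += 1
    # A class absent from both maps has an undefined IoU; report it as None.
    return [i / u if u else None for i, u in zip(inter, union)]


def mean_iou(ious):
    """Average the IoUs of the classes that actually occur."""
    valid = [v for v in ious if v is not None]
    return sum(valid) / len(valid)


# Toy 2x3 label maps with three classes (assumed data, for illustration).
pred = [[0, 0, 1],
        [1, 1, 2]]
target = [[0, 1, 1],
          [1, 1, 2]]
ious = per_class_iou(pred, target, num_classes=3)
print(ious, mean_iou(ious))
```

In this toy case the single mislabeled pixel lowers the IoU of both classes it touches, which is why per-class IoU is stricter than plain pixel accuracy.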
Abstract
State-of-the-art results of semantic segmentation are established by Fully Convolutional neural Networks (FCNs). FCNs rely on cascaded convolutional and pooling layers to gradually enlarge the receptive fields of neurons, resulting in an indirect way of modeling the distant contextual dependence. In this work, we advocate the use of spatially recurrent layers (i.e., ReNet layers), which directly capture global contexts and lead to improved feature representations. We demonstrate the effectiveness of ReNet layers by building a Naive deep ReNet (N-ReNet), which achieves competitive performance on the Stanford Background dataset. Furthermore, we integrate ReNet layers with FCNs, and develop a novel Hybrid deep ReNet (H-ReNet). It enjoys a few remarkable properties, including full-image receptive fields, end-to-end training, and efficient network execution. On the PASCAL VOC 2012 benchmark, the…
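To make the idea of a spatially recurrent layer concrete, the following is a minimal sketch and not the authors' implementation. It assumes a vanilla tanh RNN with hand-picked scalar weights that are shared between the two directions of each sweep; a real layer would use learned weights and typically a gated unit such as an LSTM or GRU. The names `sweep`, `bidirectional`, and `renet_layer` are hypothetical. The sketch sweeps every row in both directions, then sweeps every column of the result in both directions, so each output position can depend on every input pixel after a single layer.

```python
import math


def sweep(seq, w_in, w_rec):
    """Run a scalar tanh RNN over a sequence of feature lists; return all hidden states."""
    h, states = 0.0, []
    for x in seq:
        h = math.tanh(sum(w * v for w, v in zip(w_in, x)) + w_rec * h)
        states.append(h)
    return states


def bidirectional(seq, w_in, w_rec):
    """Pair the forward and backward hidden states at each position."""
    fwd = sweep(seq, w_in, w_rec)
    bwd = sweep(seq[::-1], w_in, w_rec)[::-1]
    return [[f, b] for f, b in zip(fwd, bwd)]


def renet_layer(grid, w_in_h=(0.8,), w_in_v=(0.6, 0.6), w_rec=0.5):
    """grid is an H x W nested list of 1-element feature lists; output is H x W x 2."""
    # Horizontal pass: every pixel now summarizes its entire row.
    rows = [bidirectional(row, w_in_h, w_rec) for row in grid]
    height, width = len(rows), len(rows[0])
    # Vertical pass over the row summaries: every pixel now summarizes the whole image.
    cols = [bidirectional([rows[i][j] for i in range(height)], w_in_v, w_rec)
            for j in range(width)]
    return [[cols[j][i] for j in range(width)] for i in range(height)]


# Toy 4x4 single-channel image (assumed data, for illustration).
grid = [[[0.1 * (i + j)] for j in range(4)] for i in range(4)]
base = renet_layer(grid)

# Change only the top-left pixel and re-run the layer.
poked = [row[:] for row in grid]
poked[0][0] = [1.0]
changed = renet_layer(poked)

# The forward vertical component at the bottom-right corner responds to the change.
delta = abs(changed[3][3][0] - base[3][3][0])
print(delta > 0)
```

By contrast, a single 3x3 convolution applied to the same image would leave the bottom-right output untouched by a change in the top-left pixel, which is the indirect, layer-by-layer growth of receptive fields that the abstract describes.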
Taxonomy
Topics: Advanced Neural Network Applications · Multimodal Machine Learning Applications · Machine Learning and Data Classification
