TL;DR
ReSeg is a neural network architecture for semantic segmentation that stacks recurrent (ReNet) layers on top of convolutional layers, so that predictions draw on both local features and global dependencies. The authors report state-of-the-art results on multiple datasets.
Contribution
This paper introduces ReSeg, a structured prediction model that adapts the ReNet image classification architecture to semantic segmentation. Recurrent layers placed on top of pre-trained convolutional layers capture global context, and upsampling layers recover the original image resolution.
Findings
Reports state-of-the-art performance on multiple datasets.
Combines local convolutional features with global dependencies captured by recurrent layers.
Offers an efficient, flexible architecture suitable for a variety of segmentation tasks.
Abstract
We propose a structured prediction architecture, which exploits the local generic features extracted by Convolutional Neural Networks and the capacity of Recurrent Neural Networks (RNN) to retrieve distant dependencies. The proposed architecture, called ReSeg, is based on the recently introduced ReNet model for image classification. We modify and extend it to perform the more challenging task of semantic segmentation. Each ReNet layer is composed of four RNN that sweep the image horizontally and vertically in both directions, encoding patches or activations, and providing relevant global information. Moreover, ReNet layers are stacked on top of pre-trained convolutional layers, benefiting from generic local features. Upsampling layers follow ReNet layers to recover the original image resolution in the final predictions. The proposed ReSeg architecture is efficient, flexible and suitable…
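The ReNet layer described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes plain tanh recurrent units, a vertical sweep followed by a horizontal sweep, and nearest-neighbour upsampling as a stand-in for learned upsampling layers. The names `rnn_sweep`, `bidirectional`, `renet_layer`, `init_params` and `upsample_nearest` are hypothetical and chosen only for this example.

```python
import numpy as np

def rnn_sweep(x, w_in, w_rec, reverse=False):
    """Run a plain tanh RNN along axis 0 of x, shaped (steps, batch, features)."""
    n_steps, batch, _ = x.shape
    d = w_rec.shape[0]
    h = np.zeros((batch, d))
    out = np.zeros((n_steps, batch, d))
    order = range(n_steps - 1, -1, -1) if reverse else range(n_steps)
    for t in order:
        h = np.tanh(x[t] @ w_in + h @ w_rec)
        out[t] = h
    return out

def bidirectional(x, fwd_params, bwd_params):
    """Sweep in both directions and concatenate the hidden states."""
    fwd = rnn_sweep(x, *fwd_params)
    bwd = rnn_sweep(x, *bwd_params, reverse=True)
    return np.concatenate([fwd, bwd], axis=-1)

def renet_layer(feat, params):
    """Map a feature map (H, W, C) to (H, W, 2*d) using four RNNs.

    Assumption: vertical sweep first, then horizontal sweep on its output.
    """
    # Vertical: rows are time steps, columns act as the batch.
    v = bidirectional(feat, params[0], params[1])            # (H, W, 2d)
    # Horizontal: columns are time steps, rows act as the batch.
    h = bidirectional(v.transpose(1, 0, 2), params[2], params[3])  # (W, H, 2d)
    return h.transpose(1, 0, 2)                              # (H, W, 2d)

def init_params(c_in, d, rng, scale=0.3):
    """Random weights for the four RNNs (two vertical, two horizontal)."""
    def one(n_in):
        return (rng.normal(0, scale, (n_in, d)), rng.normal(0, scale, (d, d)))
    return [one(c_in), one(c_in), one(2 * d), one(2 * d)]

def upsample_nearest(feat, factor):
    """Nearest-neighbour upsampling; a simplification, not a learned layer."""
    return feat.repeat(factor, axis=0).repeat(factor, axis=1)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feat = rng.normal(size=(4, 5, 3))   # stand-in for convolutional features
    params = init_params(c_in=3, d=8, rng=rng)
    out = renet_layer(feat, params)
    print(out.shape, upsample_nearest(out, 2).shape)
```

Because the horizontal sweep runs over the output of the vertical sweep, every output position depends on every input position. This is the sense in which the layer provides global information after a single pass.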
Videos
ReSeg: A Recurrent Neural Network-based Model for Semantic Segmentation (YouTube)
