Recurrent Convolutional Neural Networks for Scene Parsing

Pedro H. O. Pinheiro; Ronan Collobert

arXiv:1306.2795·cs.CV·June 13, 2013·75 cites

Open Access

TL;DR

This paper introduces a recurrent convolutional neural network for scene parsing that captures long-range dependencies efficiently, achieves state-of-the-art results without segmentation or task-specific features, and maintains fast inference.

Contribution

The proposed recurrent CNN effectively models large context in scene parsing without relying on segmentation or task-specific features, enabling end-to-end training and improved accuracy.

Findings

01 State-of-the-art performance on Stanford Background Dataset

02 State-of-the-art performance on SIFT Flow Dataset

03 Fast inference at test time

Abstract

Scene parsing is a technique that consists of assigning a label to every pixel in an image according to the class it belongs to. To ensure good visual coherence and high class accuracy, it is essential for a scene parser to capture long-range dependencies in the image. In a feed-forward architecture, this can be achieved simply by considering a sufficiently large input context patch around each pixel to be labeled. We propose an approach consisting of a recurrent convolutional neural network, which allows us to consider a large input context while limiting the capacity of the model. Contrary to most standard approaches, our method does not rely on any segmentation method or any task-specific features. The system is trained in an end-to-end manner over raw pixels, and models complex spatial dependencies with low inference cost. As the context size increases with the built-in recurrence, the…
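The abstract's central claim, that recurrence enlarges the input context while the model's capacity stays limited, can be illustrated with a little arithmetic. The sketch below is not the authors' architecture, which this excerpt does not describe. It assumes a plain stack of stride-1 convolutions with no pooling, and the kernel sizes and channel count are hypothetical values chosen only for illustration. Under those assumptions, applying the same stack several times with shared weights widens the context seen by each output pixel, while the number of weights does not change.

```python
def receptive_field(kernel_sizes):
    """Context width (in pixels) seen by one output of a stack of stride-1 convolutions."""
    field = 1
    for k in kernel_sizes:
        field += k - 1  # each stride-1 layer adds (k - 1) pixels of context
    return field


def recurrent_context(kernel_sizes, steps):
    """Context width after applying the same stride-1 stack `steps` times in a row."""
    return receptive_field(list(kernel_sizes) * steps)


def parameter_count(kernel_sizes, channels):
    """Weights of a 2D conv stack with `channels` input and output maps per layer (biases ignored)."""
    return sum(k * k * channels * channels for k in kernel_sizes)


# Hypothetical values, chosen for illustration only (not taken from the paper).
kernels = [8, 8]
channels = 16

for steps in (1, 2, 3):
    context = recurrent_context(kernels, steps)
    # Weights are shared across recurrent steps, so the count does not depend on `steps`.
    weights = parameter_count(kernels, channels)
    print(f"steps={steps}  context={context}x{context}  weights={weights}")
```

Under these assumptions the context grows linearly with the number of recurrent steps. A feed-forward network would have to add layers or enlarge its kernels, and therefore add parameters, to reach the same context.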

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Multimodal Machine Learning Applications · Advanced Neural Network Applications