A Spatial Layout and Scale Invariant Feature Representation for Indoor Scene Classification

Munawar Hayat; Salman H. Khan; Mohammed Bennamoun; Senjian An

arXiv:1506.05532·cs.CV·November 3, 2016

TL;DR

This paper proposes a novel convolutional neural network architecture with a spatially unstructured layer and pyramidal representation to achieve robust indoor scene classification despite spatial layout and scale variations.

Contribution

It introduces a new learnable feature descriptor and a specialized CNN architecture for improved indoor scene classification under challenging deformations.

Findings

1. Achieves up to 11.9% performance improvement on benchmark datasets.
2. Introduces a spatially unstructured layer for robustness to layout changes.
3. Employs pyramidal image representation for scale invariance.
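
To make the idea of a pyramidal image representation concrete, here is a minimal sketch in plain Python. It is not the authors' pipeline: the 2x2 average downsampling, the global max pooling, and the names `downsample_2x`, `build_pyramid`, and `pyramid_descriptor` are all assumptions chosen for illustration. It only shows the general pattern of describing one image at several scales and concatenating the per-scale features.

```python
def downsample_2x(image):
    """Halve a 2D grid by averaging each non-overlapping 2x2 block."""
    rows, cols = len(image) // 2, len(image[0]) // 2
    return [
        [
            (image[2 * r][2 * c] + image[2 * r][2 * c + 1]
             + image[2 * r + 1][2 * c] + image[2 * r + 1][2 * c + 1]) / 4.0
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def build_pyramid(image, levels):
    """Return [full scale, 1/2 scale, 1/4 scale, ...], at most `levels` entries."""
    pyramid = [image]
    for _ in range(levels - 1):
        current = pyramid[-1]
        # Stop early once a level is too small to halve again.
        if len(current) < 2 or len(current[0]) < 2:
            break
        pyramid.append(downsample_2x(current))
    return pyramid


def global_max(image):
    """Stand-in feature extractor (an assumption): the largest value in the grid."""
    return max(max(row) for row in image)


def pyramid_descriptor(image, levels=3):
    """Concatenate one pooled value per scale into a single descriptor."""
    return [global_max(level) for level in build_pyramid(image, levels)]


if __name__ == "__main__":
    toy_image = [
        [1, 2, 3, 4],
        [5, 6, 7, 8],
        [9, 10, 11, 12],
        [13, 14, 15, 16],
    ]
    print(pyramid_descriptor(toy_image))
```

In a real system the per-scale feature would come from a CNN rather than a single max value; the point of the sketch is only that objects appearing at different sizes contribute to the descriptor through different pyramid levels.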

Abstract

Unlike standard object classification, where the image to be classified contains one or multiple instances of the same object, indoor scene classification is quite different since the image consists of multiple distinct objects. Further, these objects can be of varying sizes and are present across numerous spatial locations in different layouts. For automatic indoor scene categorization, large-scale spatial layout deformations and scale variations are therefore two major challenges, and the design of rich feature descriptors which are robust to these challenges is still an open problem. This paper introduces a new learnable feature descriptor called "spatial layout and scale invariant convolutional activations" to deal with these challenges. For this purpose, a new Convolutional Neural Network architecture is designed which incorporates a novel "Spatially Unstructured" layer to introduce…
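
The abstract is cut off before it explains how the "Spatially Unstructured" layer works, so the following is only a hedged illustration of one plausible reading: perturbing the spatial arrangement of local activation blocks so that a classifier cannot rely on a fixed layout. The function name `shuffle_blocks`, the non-overlapping tiling, and the uniform random permutation are all assumptions, not details taken from the paper.

```python
import random


def shuffle_blocks(feature_map, block, rng):
    """Randomly permute non-overlapping block x block tiles of a 2D grid.

    Values inside each tile keep their local arrangement; only the tile
    positions change. This is an illustrative assumption, not the paper's layer.
    """
    rows, cols = len(feature_map), len(feature_map[0])
    if rows % block or cols % block:
        raise ValueError("feature map size must be a multiple of block")
    grid_cols = cols // block
    num_tiles = (rows // block) * grid_cols

    # order[dst] is the index of the source tile copied into destination dst.
    order = list(range(num_tiles))
    rng.shuffle(order)

    out = [[None] * cols for _ in range(rows)]
    for dst, src in enumerate(order):
        dst_r, dst_c = divmod(dst, grid_cols)
        src_r, src_c = divmod(src, grid_cols)
        for i in range(block):
            for j in range(block):
                out[dst_r * block + i][dst_c * block + j] = (
                    feature_map[src_r * block + i][src_c * block + j]
                )
    return out


if __name__ == "__main__":
    toy_map = [[4 * r + c for c in range(4)] for r in range(4)]
    for row in shuffle_blocks(toy_map, block=2, rng=random.Random(0)):
        print(row)
```

Passing an explicit `random.Random` instance keeps the perturbation reproducible; at inference time such a layer would typically be bypassed, though the source text available here does not confirm how the authors handle that.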
