A Spatial Layout and Scale Invariant Feature Representation for Indoor Scene Classification
Munawar Hayat, Salman H. Khan, Mohammed Bennamoun, Senjian An

TL;DR
This paper proposes a novel convolutional neural network architecture with a spatially unstructured layer and pyramidal representation to achieve robust indoor scene classification despite spatial layout and scale variations.
Contribution
It introduces a new learnable feature descriptor and a specialized CNN architecture for improved indoor scene classification under challenging deformations.
Findings
Achieves up to 11.9% performance improvement on benchmark datasets.
Introduces a spatially unstructured layer for robustness to layout changes.
Employs pyramidal image representation for scale invariance.
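The two ideas named above can be illustrated with a small NumPy sketch. It is not the authors' implementation: this summary does not specify how the spatially unstructured layer or the pyramid is built, so the sketch assumes that layout robustness can be mimicked by randomly permuting non-overlapping tiles of a feature map, and that a pyramid can be approximated by repeated 2x2 average pooling. The function names `shuffle_blocks` and `average_pool_pyramid` and all parameters are hypothetical.

```python
import numpy as np


def shuffle_blocks(fmap, block, rng):
    """Randomly permute non-overlapping block x block tiles of an (H, W, C) array.

    Assumption: this tile permutation is only a stand-in for a layer that
    discards global spatial layout; it is not taken from the paper.
    """
    h, w, c = fmap.shape
    if h % block or w % block:
        raise ValueError("H and W must be divisible by the block size")
    gh, gw = h // block, w // block
    # Split into a (gh*gw, block, block, C) stack of tiles.
    tiles = (fmap.reshape(gh, block, gw, block, c)
                 .transpose(0, 2, 1, 3, 4)
                 .reshape(gh * gw, block, block, c))
    # Reorder the tiles; the content inside each tile is left untouched.
    tiles = tiles[rng.permutation(gh * gw)]
    # Reassemble the tiles into an (H, W, C) array.
    return (tiles.reshape(gh, gw, block, block, c)
                 .transpose(0, 2, 1, 3, 4)
                 .reshape(h, w, c))


def average_pool_pyramid(image, levels):
    """Return a list of (H, W, C) arrays, each half the resolution of the previous.

    Assumption: 2x2 average pooling is a simple substitute for whatever
    resampling a real multi-scale pipeline would use.
    """
    pyramid = [image]
    for _ in range(levels - 1):
        prev = pyramid[-1]
        # Crop to even height and width so 2x2 pooling tiles the array exactly.
        h, w = prev.shape[0] // 2 * 2, prev.shape[1] // 2 * 2
        prev = prev[:h, :w]
        pooled = prev.reshape(h // 2, 2, w // 2, 2, -1).mean(axis=(1, 3))
        pyramid.append(pooled)
    return pyramid
```

In a pipeline of this kind, each pyramid level would be passed through the network and the tile shuffling applied to intermediate feature maps during training, so that the learned descriptor depends less on where objects sit in the image and how large they appear.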
Abstract
Unlike standard object classification, where the image to be classified contains one or multiple instances of the same object, indoor scene classification is quite different since the image consists of multiple distinct objects. Further, these objects can be of varying sizes and are present across numerous spatial locations in different layouts. For automatic indoor scene categorization, large-scale spatial layout deformations and scale variations are therefore two major challenges, and the design of rich feature descriptors that are robust to these challenges is still an open problem. This paper introduces a new learnable feature descriptor called "spatial layout and scale invariant convolutional activations" to deal with these challenges. For this purpose, a new Convolutional Neural Network architecture is designed which incorporates a novel 'Spatially Unstructured' layer to introduce…
