Understand Scene Categories by Objects: A Semantic Regularized Scene Classifier Using Convolutional Neural Networks
Yiyi Liao, Sarath Kodagoda, Yue Wang, Lei Shi, Yong Liu

TL;DR
This paper introduces a deep learning-based scene classifier that leverages object-level semantic regularization to improve accuracy in scene understanding, especially with limited training data, demonstrated on robotics applications.
Contribution
The novel approach integrates semantic segmentation as a regularizer in deep neural networks for scene classification, achieving superior results with fewer training samples.
Findings
Outperforms state-of-the-art on SUN RGB-D dataset
Achieves new record in semantic segmentation performance
Demonstrates effective generalization to robot-captured images
Abstract
Scene classification is a fundamental perception task for environmental understanding in today's robotics. In this paper, we have attempted to exploit the use of popular machine learning technique of deep learning to enhance scene understanding, particularly in robotics applications. As scene images have larger diversity than the iconic object images, it is more challenging for deep learning methods to automatically learn features from scene images with less samples. Inspired by human scene understanding based on object knowledge, we address the problem of scene classification by encouraging deep neural networks to incorporate object-level information. This is implemented with a regularization of semantic segmentation. With only 5 thousand training images, as opposed to 2.5 million images, we show the proposed deep architecture achieves superior scene classification results to the…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsAdvanced Image and Video Retrieval Techniques · Robotics and Sensor-Based Localization · Multimodal Machine Learning Applications
