Knowledge Guided Disambiguation for Large-Scale Scene Classification with Multi-Resolution CNNs
Limin Wang, Sheng Guo, Weilin Huang, Yuanjun Xiong, Yu Qiao

TL;DR
This paper combines a multi-resolution CNN architecture with knowledge-guided disambiguation techniques for large-scale scene classification. The approach targets large intra-class variation and label ambiguity, and reports state-of-the-art results on the Indoor67 and SUN397 datasets.
Contribution
The paper proposes a multi-resolution CNN framework, built from complementary coarse-resolution and fine-resolution CNNs, and two knowledge-guided disambiguation methods for scene recognition on large datasets.
Findings
Achieved second place in the Places2 challenge (ILSVRC 2015).
Secured first place in the LSUN challenge (CVPR 2016).
Set a new state of the art on the Indoor67 and SUN397 datasets.
Abstract
Convolutional Neural Networks (CNNs) have made remarkable progress on scene recognition, partly due to recent large-scale scene datasets such as Places and Places2. Scene categories are often defined by multi-level information, including local objects, global layout, and background environment, which leads to large intra-class variations. In addition, with the increasing number of scene categories, label ambiguity has become another crucial issue in large-scale classification. This paper focuses on large-scale scene recognition and makes two major contributions to tackle these issues. First, we propose a multi-resolution CNN architecture that captures visual content and structure at multiple levels. The multi-resolution CNNs are composed of coarse-resolution CNNs and fine-resolution CNNs, which are complementary to each other. Second, we design two knowledge guided…
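To make the idea of complementary coarse-resolution and fine-resolution CNNs concrete, here is a minimal sketch of one common way to combine two classifiers: averaging their class probabilities. The abstract excerpt above does not say how the two networks are combined, so the weighted-average fusion, the function names (softmax, fuse_scores), the coarse_weight parameter, and the example logits are all assumptions made for illustration, not the authors' implementation.

```python
import math


def softmax(logits):
    """Convert raw class scores to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_scores(coarse_logits, fine_logits, coarse_weight=0.5):
    """Weighted average of two networks' class probabilities.

    Assumption: score-level fusion with a single scalar weight; this is
    an illustrative choice, not taken from the abstract.
    """
    if len(coarse_logits) != len(fine_logits):
        raise ValueError("both networks must score the same set of classes")
    p_coarse = softmax(coarse_logits)
    p_fine = softmax(fine_logits)
    w = coarse_weight
    return [w * c + (1.0 - w) * f for c, f in zip(p_coarse, p_fine)]


# Hypothetical logits for three scene classes from each network.
coarse = [2.0, 1.0, 0.1]   # stand-in for a coarse-resolution CNN's output
fine = [0.5, 2.5, 0.3]     # stand-in for a fine-resolution CNN's output

fused = fuse_scores(coarse, fine)
predicted = max(range(len(fused)), key=fused.__getitem__)
```

In this toy case the two networks disagree on the top class, and the fused scores settle on the class that the more confident network prefers. In practice the weight would be tuned on held-out data.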
