Aerial Scene Understanding in The Wild: Multi-Scene Recognition via Prototype-based Memory Networks
Yuansheng Hua, Lichao Mou, Jianzhe Lin, Konrad Heidler, Xiao Xiang Zhu

TL;DR
This paper introduces a prototype-based memory network for multi-scene recognition in aerial images, leveraging single-scene datasets and minimal multi-scene annotations, and presents a new multi-scene aerial image dataset.
Contribution
It proposes a novel memory network architecture for multi-scene aerial image recognition and creates a new dataset to facilitate research in this area.
Findings
Effective multi-scene recognition is demonstrated on various datasets.
The approach requires only limited multi-scene annotations for training.
The new dataset supports further research in aerial scene understanding.
Abstract
Aerial scene recognition is a fundamental visual task and has attracted increasing research interest in the last few years. Most current research focuses on categorizing an aerial image into a single scene-level label, while in real-world scenarios a single image often contains multiple scenes. Therefore, in this paper, we propose to take a step forward to a more practical and challenging task, namely multi-scene recognition in single images. Moreover, we note that manually producing annotations for such a task is extraordinarily time- and labor-consuming. To address this, we propose a prototype-based memory network to recognize multiple scenes in a single image by leveraging massive well-annotated single-scene images. The proposed network consists of three key components: 1) a prototype learning module, 2) a prototype-inhabiting external memory, and 3) a multi-head…
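The general idea of a prototype memory can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes that a prototype is simply the mean embedding of each single-scene class, and that the memory is read with an unparameterized multi-head dot-product attention. The abstract is truncated before the third component is fully named, so the read-out step is an assumption, as are the helper names `build_prototype_memory` and `retrieve`.

```python
import numpy as np


def build_prototype_memory(embeddings, labels, num_classes):
    """Store one prototype per class: the mean embedding of that class
    (hypothetical helper; assumes every class has at least one sample)."""
    memory = np.zeros((num_classes, embeddings.shape[1]))
    for c in range(num_classes):
        memory[c] = embeddings[labels == c].mean(axis=0)
    return memory


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def retrieve(query, memory, num_heads):
    """Read the memory with multi-head scaled dot-product attention.

    Assumption: no learned projections; each head just sees its own slice
    of the feature dimension. Returns per-class attention weights averaged
    over heads, with shape (num_classes,).
    """
    num_slots, dim = memory.shape
    assert dim % num_heads == 0, "feature dim must be divisible by num_heads"
    head_dim = dim // num_heads
    q = query.reshape(num_heads, head_dim)                        # (H, d)
    m = memory.reshape(num_slots, num_heads, head_dim)            # (C, H, d)
    m = m.transpose(1, 0, 2)                                      # (H, C, d)
    scores = np.einsum("hd,hcd->hc", q, m) / np.sqrt(head_dim)    # (H, C)
    return softmax(scores, axis=-1).mean(axis=0)                  # (C,)


# Toy single-scene embeddings: two samples of class 0, one each of classes 1 and 2.
embeddings = np.array([[1.0, 0.0, 0.0, 0.0],
                       [3.0, 0.0, 0.0, 0.0],
                       [0.0, 2.0, 0.0, 0.0],
                       [0.0, 0.0, 0.0, 4.0]])
labels = np.array([0, 0, 1, 2])

memory = build_prototype_memory(embeddings, labels, num_classes=3)
weights = retrieve(memory[0], memory, num_heads=2)
```

In this toy setup, querying with the class-0 prototype puts the largest weight on memory slot 0. How such weights are turned into multi-label predictions is not covered by the excerpt above, so it is left out of the sketch.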
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Robotics and Sensor-Based Localization · Domain Adaptation and Few-Shot Learning
Methods: Memory Network
