Language-guided Learning for Object Detection Tackling Multiple Variations in Aerial Images
Sungjune Park, Hyunjun Kim, Beomchan Park, Yong Man Ro

TL;DR
This paper introduces LANGO, a novel language-guided object detection framework for aerial images that addresses scene and instance-level variations, improving detection accuracy through semantic reasoning and relation learning.
Contribution
The paper proposes a new language-guided learning framework with a visual semantic reasoner and relation learning loss to handle variations in aerial image object detection.
Findings
Significant improvement in detection performance on aerial image datasets.
Effective handling of viewpoint, scale, and environmental variations.
Enhanced understanding of scene semantics improves localization accuracy.
Abstract
Despite recent advancements in computer vision research, object detection in aerial images still suffers from several challenges. One primary challenge is the presence of multiple types of variation in aerial images, for example, illumination and viewpoint changes. These variations result in highly diverse image scenes and drastic alterations in object appearance, which makes it harder to localize objects within the whole image scene and recognize their categories. To address this problem, we introduce a novel object detection framework for aerial images, named LANGuage-guided Object detection (LANGO). Built on the proposed language-guided learning, the framework is designed to alleviate the impact of both scene-level and instance-level variations. First, we are motivated by the way humans understand the semantics of scenes while perceiving…
Taxonomy
Topics: Multimodal Machine Learning Applications · Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning
