AUG: A New Dataset and An Efficient Model for Aerial Image Urban Scene Graph Generation
Yansheng Li, Kun Li, Yongjun Zhang, Linlin Wang, Dingwen Zhang

TL;DR
This paper introduces AUG, a new aerial image dataset for scene graph generation, and proposes an efficient locality-preserving graph convolutional network that outperforms existing methods in understanding urban scenes from overhead images.
Contribution
The paper creates the AUG dataset for overhead urban scene graph generation and develops LPG, a novel locality-preserving graph convolutional network for improved scene understanding.
Findings
LPG significantly outperforms state-of-the-art methods on AUG.
The AUG dataset contains 25,594 objects, 16,970 relationships, and 27,175 attributes, all manually annotated.
The proposed ABS-PRD method effectively prunes meaningless relationship pairs.
Abstract
Scene graph generation (SGG) aims to understand the visual objects in a given image and the semantic relationships among them. Many SGG datasets with an eye-level view have been released, but SGG datasets with an overhead view have scarcely been studied. In contrast to the eye-level view, where object occlusion impedes SGG, the overhead view offers a new perspective that promotes SGG by providing a clear perception of the spatial relationships among objects in the ground scene. To fill the gap in overhead-view datasets, this paper constructs and releases an aerial image urban scene graph generation (AUG) dataset. Images in the AUG dataset are captured from a low-altitude overhead view. In the AUG dataset, 25,594 objects, 16,970 relationships, and 27,175 attributes are manually annotated. To avoid the local context being overwhelmed in the complex…
Taxonomy
Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Video Surveillance and Tracking Methods
