Learning to Evaluate Performance of Multi-modal Semantic Localization

Zhiqiang Yuan; Wenkai Zhang; Chongyang Li; Zhaoying Pan; Yongqiang Mao; Jialiang Chen; Shouke Li; Hongqi Wang; and Xian Sun

arXiv:2209.06515·cs.CV·September 20, 2022

TL;DR

This paper systematically studies semantic localization in remote sensing, proposing new evaluation metrics, creating a benchmark dataset, and analyzing model performance to advance the understanding and application of multi-modal semantic localization.

Contribution

It introduces comprehensive evaluation metrics, a diverse test dataset, and a detailed benchmark for the semantic localization task in remote sensing, filling a significant research gap.

Findings

01 Proposed new metrics for pixel-level and region-level evaluation.

02 Created AIR-SLT, a large-scale, multi-objective test dataset.

03 Analyzed the impact of variables on model performance.

Abstract

Semantic localization (SeLo) refers to the task of obtaining the most relevant locations in large-scale remote sensing (RS) images using semantic information such as text. As an emerging task based on cross-modal retrieval, SeLo achieves semantic-level retrieval with only caption-level annotation, which demonstrates its great potential in unifying downstream tasks. Although SeLo has been addressed in a succession of works, none has systematically explored and analyzed this urgent direction. In this paper, we thoroughly study this field and provide a complete benchmark in terms of metrics and test data to advance the SeLo task. First, based on the characteristics of this task, we propose multiple discriminative evaluation metrics to quantify the performance of the SeLo task. The devised significant area proportion, attention shift distance, and discrete attention…

Code & Models

Repositories

xiaoyuan1996/semanticlocalizationmetrics
PyTorch · Official

Taxonomy

Methods: Test