Change Detection Meets Visual Question Answering
Zhenghang Yuan, Lichao Mou, Zhitong Xiong, Xiaoxiang Zhu

TL;DR
This paper introduces a new task called change detection-based visual question answering (CDVQA) that enables users to query high-level change information from multi-temporal aerial images, supported by a new dataset and baseline framework.
Contribution
The paper presents the first CDVQA dataset, a baseline model with a change enhancing module, and analyzes different fusion strategies for the task.
Findings
The change enhancing module improves the capture of change-related information.
Different fusion strategies significantly affect CDVQA performance.
The dataset and baseline provide a foundation for future research in CDVQA.
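To make "fusion strategy" concrete, here is a minimal sketch of a few common ways to combine two feature vectors of equal length. This summary does not say which strategies the paper compares or which features are fused, so the strategy names, the `fuse` helper, and the use of plain Python lists are all illustrative assumptions, not the authors' implementation.

```python
from typing import Callable, Dict, List

Vector = List[float]


def fuse_concat(a: Vector, b: Vector) -> Vector:
    """Keep both inputs side by side; output length is len(a) + len(b)."""
    return a + b


def fuse_subtract(a: Vector, b: Vector) -> Vector:
    """Element-wise difference b - a, which emphasizes what differs."""
    return [y - x for x, y in zip(a, b)]


def fuse_add(a: Vector, b: Vector) -> Vector:
    """Element-wise sum."""
    return [x + y for x, y in zip(a, b)]


def fuse_multiply(a: Vector, b: Vector) -> Vector:
    """Element-wise product."""
    return [x * y for x, y in zip(a, b)]


# Hypothetical registry; the names are chosen for this sketch only.
FUSIONS: Dict[str, Callable[[Vector, Vector], Vector]] = {
    "concat": fuse_concat,
    "subtract": fuse_subtract,
    "add": fuse_add,
    "multiply": fuse_multiply,
}


def fuse(a: Vector, b: Vector, strategy: str) -> Vector:
    """Combine two feature vectors with the named strategy."""
    if strategy not in FUSIONS:
        raise ValueError(f"unknown fusion strategy: {strategy}")
    if strategy != "concat" and len(a) != len(b):
        raise ValueError("element-wise fusion needs vectors of equal length")
    return FUSIONS[strategy](a, b)
```

Concatenation preserves both inputs but doubles the feature size, while the element-wise variants keep the size fixed and differ in which relationship between the inputs they expose to later layers.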
Abstract
The Earth's surface is continually changing, and identifying changes plays an important role in urban planning and sustainability. Although change detection techniques have been successfully developed for many years, these techniques are still limited to experts and facilitators in related fields. In order to provide every user with flexible access to change information and help them better understand land-cover changes, we introduce a novel task: change detection-based visual question answering (CDVQA) on multi-temporal aerial images. In particular, multi-temporal images can be queried to obtain high-level change-based information according to content changes between two input images. We first build a CDVQA dataset including multi-temporal image-question-answer triplets using an automatic question-answer generation method. Then, a baseline CDVQA framework is devised in this work, and…
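As a rough illustration of what an image-question-answer triplet and automatic question-answer generation could look like, the sketch below derives yes/no questions from a pair of per-pixel land-cover label maps. The abstract does not describe the actual generation rules, so the `CDVQATriplet` class, the `generate_triplets` function, the question templates, and the label-map input are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List

LabelMap = List[List[str]]  # assumed input: one land-cover class name per pixel


@dataclass(frozen=True)
class CDVQATriplet:
    """One sample: an image pair (by id), a question, and its answer."""
    image_pair_id: str
    question: str
    answer: str


def generate_triplets(pair_id: str, before: LabelMap, after: LabelMap) -> List[CDVQATriplet]:
    """Derive simple yes/no questions from two aligned label maps."""
    if len(before) != len(after) or any(
        len(rb) != len(ra) for rb, ra in zip(before, after)
    ):
        raise ValueError("label maps must have the same shape")

    flat_before = [c for row in before for c in row]
    flat_after = [c for row in after for c in row]

    # A pixel counts as changed when its class differs between the two dates.
    any_change = any(b != a for b, a in zip(flat_before, flat_after))
    triplets = [
        CDVQATriplet(pair_id, "Has the area changed?", "yes" if any_change else "no")
    ]

    # One question per class, in sorted order so the output is deterministic.
    count_before = Counter(flat_before)
    count_after = Counter(flat_after)
    for cls in sorted(set(flat_before) | set(flat_after)):
        increased = count_after[cls] > count_before[cls]
        triplets.append(
            CDVQATriplet(
                pair_id,
                f"Has the area of {cls} increased?",
                "yes" if increased else "no",
            )
        )
    return triplets
```

For example, a 2x2 pair in which two pixels turn into `buildings` yields one "Has the area changed?" question plus one question per class present in either map.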
Taxonomy
Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning
