Vision-Language Agents for Interactive Forest Change Analysis
James Brock, Ce Zhang, Nantheera Anantrasirichai

TL;DR
This paper presents an LLM-driven vision-language system for interactive forest change analysis, integrating change detection and semantic captioning to enhance interpretability and accessibility of satellite imagery data.
Contribution
It introduces a novel multi-level change interpretation model and the Forest-Change dataset for improved remote sensing image change analysis.
Findings
Achieved 67.10% mIoU and a BLEU-4 score of 40.17 on the Forest-Change dataset.
Achieved 88.13% mIoU and a BLEU-4 score of 34.41 on LEVIR-MCI-Trees.
Demonstrated the system's potential to improve interpretability and efficiency in forest change analysis.
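For readers unfamiliar with the change detection metric reported above, the following is a minimal sketch of mean Intersection-over-Union (mIoU) in its standard form. It assumes flattened per-pixel class labels and averages IoU over the classes that occur in either mask. The function name `mean_iou` and the toy labels are illustrative only, and the averaging convention used in the paper (per image or per dataset, and which classes are counted) is not stated in this summary.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over classes present in the prediction or the reference.

    pred, target: equal-length sequences of integer class labels
    (a change mask flattened to one label per pixel).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    ious = []
    for c in range(num_classes):
        # Pixels labelled c in both masks.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels labelled c in at least one mask.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both masks
            ious.append(inter / union)

    # Assumption: return 0.0 when no class in range occurs at all.
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: 0 = no change, 1 = change (hypothetical labels).
pred = [0, 0, 1, 1]
target = [0, 1, 1, 1]
score = mean_iou(pred, target, num_classes=2)
```

In the toy example, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12. Reported figures such as 67.10% correspond to this ratio expressed as a percentage.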
Abstract
Modern forest monitoring workflows increasingly benefit from the growing availability of high-resolution satellite imagery and advances in deep learning. Two persistent challenges in this context are accurate pixel-level change detection and meaningful semantic change captioning for complex forest dynamics. While large language models (LLMs) are being adapted for interactive data exploration, their integration with vision-language models (VLMs) for remote sensing image change interpretation (RSICI) remains underexplored. To address this gap, we introduce an LLM-driven agent for integrated forest change analysis that supports natural language querying across multiple RSICI tasks. The proposed system builds upon a multi-level change interpretation (MCI) vision-language backbone with LLM-based orchestration. To facilitate adaptation and evaluation in forest environments, we further…
