Neurosymbolic Inference On Foundation Models For Remote Sensing Text-to-image Retrieval With Complex Queries
Emanuele Mezzi, Gertjan Burghouts, Maarten Kruithof

TL;DR
This paper presents RUNE, a neurosymbolic reasoning approach that enhances remote sensing text-to-image retrieval by explicitly reasoning over complex queries, improving interpretability, robustness, and performance over existing vision-language models.
Contribution
Introduction of RUNE, a neurosymbolic inference framework combining LLMs and logic reasoning for improved complex query handling in remote sensing image retrieval.
Findings
RUNE outperforms state-of-the-art models on complex retrieval tasks.
Proposed metrics RRQC and RRIU effectively evaluate robustness.
Logic decomposition improves scalability and efficiency.
Abstract
Text-to-image retrieval in remote sensing (RS) has advanced rapidly with the rise of large vision-language models (LVLMs) tailored for aerial and satellite imagery, culminating in remote sensing large vision-language models (RS-LVLMs). However, limited explainability and poor handling of complex spatial relations remain key challenges for real-world use. To address these issues, we introduce RUNE (Reasoning Using Neurosymbolic Entities), an approach that combines Large Language Models (LLMs) with neurosymbolic AI to retrieve images by reasoning over the compatibility between detected entities and First-Order Logic (FOL) expressions derived from text queries. Unlike RS-LVLMs that rely on implicit joint embeddings, RUNE performs explicit reasoning, enhancing performance and interpretability. For scalability, we propose a logic decomposition strategy that operates on conditioned subsets of…
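To make "reasoning over the compatibility between detected entities and FOL expressions" concrete, here is a minimal toy sketch. It is not RUNE's implementation: the `Entity` type, the `near` predicate with its 50-pixel threshold, the `satisfies` and `retrieve` helpers, and the sample detections are all hypothetical, chosen only to show how a query such as "an airplane near a runway" could be checked as the formula ∃a ∃b: airplane(a) ∧ runway(b) ∧ near(a, b) against per-image detections.

```python
from dataclasses import dataclass
from itertools import product
from math import hypot


@dataclass(frozen=True)
class Entity:
    """A detected object: class label plus box centre in pixels (hypothetical schema)."""
    label: str
    x: float
    y: float


def near(a: Entity, b: Entity, max_dist: float = 50.0) -> bool:
    """Assumed spatial predicate: centres at most max_dist pixels apart."""
    return hypot(a.x - b.x, a.y - b.y) <= max_dist


def satisfies(entities, label_a, label_b, relation) -> bool:
    """Check: exists a, b with label_a(a) and label_b(b) and relation(a, b)."""
    cands_a = [e for e in entities if e.label == label_a]
    cands_b = [e for e in entities if e.label == label_b]
    return any(relation(a, b) for a, b in product(cands_a, cands_b) if a is not b)


def retrieve(images, label_a, label_b, relation):
    """Return the names of images whose detections satisfy the formula."""
    return [name for name, ents in images.items()
            if satisfies(ents, label_a, label_b, relation)]


# Made-up detections for three images.
images = {
    "img_a": [Entity("airplane", 10, 10), Entity("runway", 30, 20)],
    "img_b": [Entity("airplane", 10, 10), Entity("runway", 400, 400)],
    "img_c": [Entity("ship", 5, 5)],
}

matches = retrieve(images, "airplane", "runway", near)
print(matches)
```

Because each match comes with the specific entity pair that satisfied the formula, a result can be traced back to concrete detections, which is the kind of interpretability the abstract contrasts with implicit joint embeddings.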
Taxonomy
Topics: Multimodal Machine Learning Applications · Remote-Sensing Image Classification · Advanced Neural Network Applications
