From YOLO to VLMs: Advancing Zero-Shot and Few-Shot Detection of Wastewater Treatment Plants Using Satellite Imagery in MENA Region
Akila Premarathna, Kanishka Hewageegana, Garcia Andarcia Mariangel

TL;DR
This paper demonstrates that vision-language models can effectively identify wastewater treatment plants in satellite images, often outperforming traditional YOLOv8 methods, with zero-shot capabilities enabling scalable environmental monitoring in the MENA region.
Contribution
It introduces a structured methodology for comparing VLMs to YOLOv8 for WWTP detection, highlighting the potential of zero-shot VLMs as annotation-free alternatives.
Findings
VLMs, especially Gemma 3, outperform YOLOv8 in zero-shot WWTP detection.
Zero-shot VLMs achieve high true positive rates without manual annotation.
VLMs enable scalable, efficient remote sensing for environmental monitoring.
Abstract
In the Middle East and North Africa (MENA) region, there is high demand for wastewater treatment plants (WWTPs), which are crucial for sustainable water management. Precise identification of WWTPs from satellite images enables environmental monitoring. Traditional methods such as YOLOv8 segmentation require extensive manual labeling. However, studies indicate that vision-language models (VLMs) are an efficient alternative, achieving equivalent or superior results through inherent reasoning and annotation. This study presents a structured methodology for VLM comparison, divided into zero-shot and few-shot streams, specifically to identify WWTPs. YOLOv8 was trained on a governmental dataset of 83,566 high-resolution satellite images from Egypt, Saudi Arabia, and the UAE: ~85% WWTPs (positives), 15% non-WWTPs (negatives). Evaluated VLMs include LLaMA 3.2 Vision, Qwen 2.5 VL, DeepSeek-VL2, Gemma 3,…
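To make the dataset split and the true positive rate concrete, here is a minimal Python sketch. It is not the authors' evaluation code: the function names, the rounding of the approximate 85% share, and the toy labels are assumptions made purely for illustration.

```python
from typing import Sequence, Tuple


def approx_class_counts(total: int, positive_fraction: float) -> Tuple[int, int]:
    """Estimate (positives, negatives) from a total and an approximate positive share.

    The abstract gives the split only approximately (~85% / 15%), so these
    counts are estimates, not figures reported by the paper.
    """
    positives = round(total * positive_fraction)
    return positives, total - positives


def true_positive_rate(labels: Sequence[bool], predictions: Sequence[bool]) -> float:
    """TPR (recall) = TP / (TP + FN), computed over the ground-truth positive images."""
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    tp = sum(1 for y, p in zip(labels, predictions) if y and p)
    fn = sum(1 for y, p in zip(labels, predictions) if y and not p)
    if tp + fn == 0:
        raise ValueError("no positive examples: TPR is undefined")
    return tp / (tp + fn)


# Estimated split for the 83,566-image dataset described in the abstract.
positives, negatives = approx_class_counts(83_566, 0.85)
print(positives, negatives)  # 71031 12535

# Toy example (hypothetical labels, not results from the paper):
# three WWTP images, one non-WWTP image; the model misses one WWTP.
labels = [True, True, True, False]
predictions = [True, False, True, True]
print(true_positive_rate(labels, predictions))  # 2 of 3 positives found
```

TPR ignores false alarms on negative images, so with a dataset this heavily skewed toward positives it would normally be read alongside a measure of performance on the non-WWTP class.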
Taxonomy
Topics: Remote-Sensing Image Classification · Marine and coastal ecosystems · Flood Risk Assessment and Management
