From Drone Imagery to Livability Mapping: AI-powered Environment Perception in Rural China
Weihuan Deng, Yaofu Huang, Luan Chen, Xun Li, Yu Gu, Yao Yao

TL;DR
This paper introduces a novel AI framework using drone imagery and multimodal language models to assess rural livability in China, addressing cost and scalability issues in environmental perception.
Contribution
It develops a systematic methodology combining chain-of-thought prompting, text-constrained comparisons, and an innovative ranking algorithm for large-scale rural livability assessment.
Findings
Achieved a Spearman Footrule distance of 0.74, outperforming commercial models.
Enhanced computational efficiency threefold through concurrent comparison and ranking.
Provided data and methodological breakthroughs for large-scale village livability analysis.
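For readers unfamiliar with the metric named in the findings, the following is a minimal sketch of the standard Spearman footrule distance between two rankings of the same items. The function names, the village labels, and the normalization by the maximum possible distance (floor(n^2 / 2)) are assumptions for illustration; this summary does not state how the reported 0.74 figure is scaled or oriented, so the sketch should not be read as the authors' evaluation code.

```python
def footrule_distance(rank_a, rank_b):
    """Sum of absolute position differences of each item across two rankings."""
    if set(rank_a) != set(rank_b) or len(rank_a) != len(set(rank_a)):
        raise ValueError("both rankings must contain the same unique items")
    pos_b = {item: i for i, item in enumerate(rank_b)}
    return sum(abs(i - pos_b[item]) for i, item in enumerate(rank_a))


def normalized_footrule(rank_a, rank_b):
    """Footrule distance divided by its maximum, floor(n^2 / 2); 0 means identical."""
    n = len(rank_a)
    max_dist = (n * n) // 2
    return footrule_distance(rank_a, rank_b) / max_dist if max_dist else 0.0


# Hypothetical village labels, ordered from most to least livable.
reference = ["v1", "v2", "v3", "v4"]
predicted = ["v2", "v1", "v3", "v4"]

swap_distance = footrule_distance(reference, predicted)
swap_normalized = normalized_footrule(reference, predicted)
reversed_normalized = normalized_footrule(reference, reference[::-1])
```

Under this normalization, identical rankings score 0 and a fully reversed ranking scores 1.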
Abstract
The high cost of acquiring rural street view images has constrained comprehensive environmental perception in rural areas. Drone photographs, with their advantages of easy acquisition, broad coverage, and high spatial resolution, offer a viable approach for large-scale rural environmental perception. However, a systematic methodology for identifying key environmental elements from drone photographs and quantifying their impact on environmental perception remains lacking. To address this gap, a Vision-Language Contrastive Ranking Framework (VLCR) is designed for rural livability assessment in China. The framework employs chain-of-thought prompting strategies to guide multimodal large language models (MLLMs) in identifying visual features related to quality of life and ecological habitability from drone photographs. Subsequently, to address the instability in pairwise village comparison,…
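As a rough illustration of pairwise village comparison run concurrently, the sketch below issues all comparisons through a thread pool and orders villages by number of wins. Everything here is an assumption: `compare` is a hypothetical stub standing in for an MLLM judgement, the toy scores are invented placeholders, and win counting is a simple stand-in that is not the ranking algorithm proposed in the paper.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def compare(pair, scores):
    """Hypothetical stub for an MLLM judgement: return the preferred village."""
    a, b = pair
    return a if scores[a] >= scores[b] else b


def rank_by_wins(villages, scores, max_workers=4):
    """Run every pairwise comparison concurrently, then sort by win count."""
    pairs = list(combinations(villages, 2))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so winners[i] belongs to pairs[i].
        winners = list(pool.map(lambda p: compare(p, scores), pairs))
    wins = {v: 0 for v in villages}
    for winner in winners:
        wins[winner] += 1
    return sorted(villages, key=lambda v: wins[v], reverse=True)


# Placeholder scores, used only so the stub has something to compare.
toy_scores = {"A": 0.2, "B": 0.9, "C": 0.5}
ranking = rank_by_wins(["A", "B", "C"], toy_scores)
```

Threads suit this pattern because a real comparison would be dominated by waiting on a model response rather than local computation.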
Taxonomy
Topics: Multimodal Machine Learning Applications · Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning
