Platonic Representations for Poverty Mapping: Unified Vision-Language Codes or Agent-Induced Novelty?
Satiyabooshan Murugaboopathy, Connor T. Jerzak, Adel Daoud

TL;DR
This study tests whether satellite imagery, LLM-generated descriptions, and agent-retrieved web text can predict household wealth. It reports evidence of shared representations across modalities and argues that multimodal data is useful for socio-economic mapping.
Contribution
The paper introduces a multimodal framework that fuses vision and language data for wealth prediction, and it provides evidence of shared latent representations across modalities.
Findings
Fusing vision and language improves wealth prediction accuracy.
Fused embeddings from the vision and language modalities correlate moderately, suggesting a shared latent code.
LLM-generated text outperforms agent-retrieved data in representing socio-economic information.
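The first finding can be illustrated with a toy ensemble. This is a minimal sketch on synthetic data, not the authors' pipeline and not a reproduction of their results. It assumes two hypothetical predictors (`vision_pred` and `text_pred`) whose errors are independent, and it combines them with an unweighted average, which is an assumed stand-in for whatever fusion the paper actually uses. The function names `r_squared`, `average_ensemble`, and `simulate` are illustrative.

```python
import random


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def average_ensemble(*prediction_lists):
    """Unweighted mean of per-pipeline predictions, one value per location."""
    return [sum(preds) / len(preds) for preds in zip(*prediction_lists)]


def simulate(n=500, noise_sd=15.0, seed=0):
    """Synthetic wealth scores on a 0-100 scale plus two noisy predictors.

    Assumption: each predictor equals the true score plus independent
    Gaussian noise. Real vision and text errors are likely correlated.
    """
    rng = random.Random(seed)
    wealth = [rng.uniform(0, 100) for _ in range(n)]
    vision = [w + rng.gauss(0, noise_sd) for w in wealth]
    text = [w + rng.gauss(0, noise_sd) for w in wealth]
    return wealth, vision, text


wealth, vision_pred, text_pred = simulate()
ensemble_pred = average_ensemble(vision_pred, text_pred)

print("vision-only R^2:", round(r_squared(wealth, vision_pred), 3))
print("text-only R^2:  ", round(r_squared(wealth, text_pred), 3))
print("ensemble R^2:   ", round(r_squared(wealth, ensemble_pred), 3))
```

Averaging helps here only because the two simulated error terms are independent, so they partly cancel. If the modalities encoded exactly the same information with the same errors, the ensemble would gain nothing, which is why a moderate rather than perfect correlation between modalities is compatible with fusion improving accuracy.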
Abstract
We investigate whether socio-economic indicators like household wealth leave recoverable imprints in satellite imagery (capturing physical features) and Internet-sourced text (reflecting historical/economic narratives). Using Demographic and Health Survey (DHS) data from African neighborhoods, we pair Landsat images with LLM-generated textual descriptions conditioned on location/year and text retrieved by an AI search agent from web sources. We develop a multimodal framework predicting household wealth (International Wealth Index) through five pipelines: (i) vision model on satellite images, (ii) LLM using only location/year, (iii) AI agent searching/synthesizing web text, (iv) joint image-text encoder, (v) ensemble of all signals. Our framework yields three contributions. First, fusing vision and agent/LLM text outperforms vision-only baselines in wealth prediction (e.g., R-squared of…
Taxonomy
Topics: Language and cultural evolution · Computational and Text Analysis Methods · Human Mobility and Location-Based Analysis
