Flickr Africa: Examining Geo-Diversity in Large-Scale, Human-Centric Visual Data
Keziah Naggita, Julienne LaChance, Alice Xiang

TL;DR
This study analyzes the geographic and socio-economic diversity of Flickr images in Africa, revealing biases and gaps in data collection that impact the global applicability of computer vision models.
Contribution
It provides a large-scale analysis of geo-diversity in African visual data, highlighting biases and the need for more representative datasets.
Findings
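African nations are significantly underrepresented in geotagged Flickr images relative to their populations, including in comparisons with population-matched nations in Europe.
An 'othering' phenomenon is present: a substantial number of images from Africa were taken by non-local photographers.
Temporal analyses at two-year intervals expose emerging trends in data availability.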
Abstract
Biases in large-scale image datasets are known to influence the performance of computer vision models as a function of geographic context. To investigate the limitations of standard Internet data collection methods in low- and middle-income countries, we analyze human-centric image geo-diversity on a massive scale using geotagged Flickr images associated with each nation in Africa. We report the quantity and content of available data with comparisons to population-matched nations in Europe as well as the distribution of data according to fine-grained intra-national wealth estimates. Temporal analyses are performed at two-year intervals to expose emerging data trends. Furthermore, we present findings for an "othering" phenomenon as evidenced by a substantial number of images from Africa being taken by non-local photographers. The results of our study suggest that further work is…
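As a rough illustration of the kind of bookkeeping the abstract describes (image counts compared against population, and temporal analysis at two-year intervals), here is a minimal Python sketch. It is not the authors' code: the function names, the per-million normalization, and the 2004 starting year are all assumptions made for this example, and no real counts from the study are used.

```python
from collections import Counter
from datetime import date
from typing import Dict, Iterable, Tuple


def two_year_bin(taken: date, start_year: int = 2004) -> Tuple[int, int]:
    """Return the (first_year, last_year) two-year interval containing `taken`.

    `start_year` is a hypothetical origin for the bins, not a value from the paper.
    """
    offset = (taken.year - start_year) // 2
    first = start_year + 2 * offset
    return (first, first + 1)


def images_per_million(image_count: int, population: int) -> float:
    """Normalize an image count by population (images per million residents)."""
    return image_count * 1_000_000 / population


def count_by_interval(dates: Iterable[date], start_year: int = 2004) -> Dict[Tuple[int, int], int]:
    """Count images per two-year interval from their capture dates."""
    return dict(Counter(two_year_bin(d, start_year) for d in dates))
```

Normalizing by population is one simple way to make counts comparable between a nation and its population-matched counterpart; the paper may use a different comparison.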
