Inspecting the Geographical Representativeness of Images from Text-to-Image Models
Abhipsa Basu, R. Venkatesh Babu, Danish Pruthi

TL;DR
This study evaluates how well popular text-to-image models such as DALL·E 2 and Stable Diffusion represent different geographical regions in their generated images, and finds that the images skew toward a small number of countries.
Contribution
It introduces a crowdsourced methodology to measure geographical bias in generated images and analyzes the impact of specifying country names on representativeness.
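As a rough illustration of how crowdsourced ratings could be turned into a per-country score, here is a minimal sketch. It is not the authors' scoring procedure. The 1-5 rating scale, the threshold of 4, the function name `representativeness_by_country`, and the toy responses are all assumptions made for this example.

```python
from collections import defaultdict


def representativeness_by_country(responses, threshold=4):
    """Fraction of ratings per country that are at or above `threshold`.

    `responses` is an iterable of (country, rating) pairs. The 1-5 rating
    scale and the default threshold are assumptions for this sketch, not
    details taken from the paper.
    """
    counts = defaultdict(lambda: [0, 0])  # country -> [n_high, n_total]
    for country, rating in responses:
        counts[country][1] += 1
        if rating >= threshold:
            counts[country][0] += 1
    return {c: high / total for c, (high, total) in counts.items()}


# Made-up responses for illustration only; not data from the study.
toy_responses = [
    ("India", 4), ("India", 2), ("India", 5),
    ("Brazil", 1), ("Brazil", 5), ("Brazil", 2),
]
scores = representativeness_by_country(toy_responses)
for country, score in sorted(scores.items()):
    print(f"{country}: {score:.2f}")
```

Comparing such scores for prompts with and without a country name would be one way to quantify the effect of specifying the country, under the same assumptions.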
Findings
Generated images predominantly reflect US and Indian surroundings.
Specifying country names improves geographical representativeness.
Many countries remain underrepresented despite input specifications.
Abstract
Recent progress in generative models has resulted in models that produce both realistic and relevant images for most textual inputs. These models are being used to generate millions of images every day, and hold the potential to drastically impact areas such as generative art, digital marketing and data augmentation. Given their outsized impact, it is important to ensure that the generated content reflects the artifacts and surroundings across the globe, rather than over-representing certain parts of the world. In this paper, we measure the geographical representativeness of common nouns (e.g., a house) generated through DALL·E 2 and Stable Diffusion models using a crowdsourced study comprising 540 participants across 27 countries. For deliberately underspecified inputs without country names, the generated images most reflect the surroundings of the United States followed by…
Taxonomy
Topics: Image Retrieval and Classification Techniques
Methods: Diffusion
