Interpretations, Representations, and Stereotypes of Caste within Text-to-Image Generators
Sourojit Ghosh

TL;DR
This paper investigates how text-to-image generators, specifically Stable Diffusion, portray caste in India. It finds biases that favor high-caste representations and marginalize Dalit identities, underscoring the need for caste-aware AI design.
Contribution
It provides a novel analysis of caste representations in T2I models, revealing biases and stereotypes, and offers design recommendations for more equitable AI systems.
Findings
Stable Diffusion equates Indianness with high-caste identities.
Dalit representations are stereotyped as rural and protesting.
Models perpetuate caste-based stereotypes and marginalization.
Abstract
The surge in the popularity of text-to-image generators (T2Is) has been matched by extensive research into ensuring fairness and equitable outcomes, with a focus on how they impact society. However, such work has typically focused on globally-experienced identities or centered Western contexts. In this paper, we address interpretations, representations, and stereotypes surrounding a tragically underexplored context in T2I research: caste. We examine how the T2I Stable Diffusion displays people of various castes, and what professions they are depicted as performing. Generating 100 images per prompt, we perform CLIP-cosine similarity comparisons with default depictions of an 'Indian person' by Stable Diffusion, and explore patterns of similarity. Our findings reveal how Stable Diffusion outputs perpetuate systems of 'castelessness', equating Indianness with high-castes and depicting…
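The comparison step described in the abstract can be illustrated with a minimal sketch. It assumes each generated image has already been converted to an embedding vector (for example by a CLIP image encoder, which is not called here) and averages the cosine similarity over all pairs drawn from two image sets. The function names, the averaging scheme, and the random placeholder vectors are assumptions for illustration only; the paper may aggregate similarities differently.

```python
import math
import random


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def mean_pairwise_similarity(set_a, set_b):
    """Average cosine similarity over every (a, b) pair from the two sets."""
    scores = [cosine_similarity(u, v) for u in set_a for v in set_b]
    return sum(scores) / len(scores)


# Placeholder embeddings: 100 vectors per prompt, mirroring the 100 images
# per prompt in the abstract. Real embeddings would come from an image
# encoder; these are random and only show the shape of the computation.
random.seed(0)
DIM = 8  # hypothetical small dimension to keep the example fast
baseline = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(100)]
candidate = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(100)]

score = mean_pairwise_similarity(candidate, baseline)
print(f"mean cosine similarity: {score:.4f}")
```

Under this scheme, a higher score for one prompt's images against the default 'Indian person' images would indicate that the model's default depiction sits closer to that prompt's depictions.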
Taxonomy
Topics: Italian Fascism and Post-war Society
