Civiverse: A Dataset for Analyzing User Engagement with Open-Source Text-to-Image Models
Maria-Teresa De Rosa Palmini, Laura Wagner, Eva Cetinic

TL;DR
This paper introduces the Civiverse dataset for analyzing user engagement with open-source text-to-image (TTI) models. Its prompt analysis reveals a preference for explicit content and a semantic homogenization of prompts, highlighting societal issues in AI-generated visuals.
Contribution
It provides the first systematic cultural analysis of open-source TTI platforms using a large-scale prompt dataset, revealing user behaviors and societal biases.
Findings
Predominance of explicit content generation
Semantic homogenization of prompts
Potential reinforcement of stereotypes
Abstract
Text-to-image (TTI) systems, particularly those utilizing open-source frameworks, have become increasingly prevalent in the production of Artificial Intelligence (AI)-generated visuals. While existing literature has explored various problematic aspects of TTI technologies, such as bias in generated content, intellectual property concerns, and the reinforcement of harmful stereotypes, open-source TTI frameworks have not yet been systematically examined from a cultural perspective. This study addresses this gap by analyzing the CivitAI platform, a leading open-source platform dedicated to TTI AI. We introduce the Civiverse prompt dataset, encompassing millions of images and related metadata. We focus on prompt analysis, specifically examining the semantic characteristics of text prompts, as it is crucial for addressing societal issues related to generative technologies. This analysis…
Taxonomy
Topics: FinTech, Crowdfunding, Digital Finance
Methods
Focus
