TWIGMA: A dataset of AI-Generated Images with Metadata From Twitter
Yiqun Chen, James Zou

TL;DR
This paper introduces TWIGMA, a large dataset of over 800,000 AI-generated images from Twitter with metadata, revealing insights into their characteristics, evolution, and user engagement over time.
Contribution
The paper presents TWIGMA, a novel comprehensive dataset of AI-generated images with metadata, enabling new research on generative AI content and user interactions.
Findings
Gen-AI images have distinctive features compared to natural images.
Gen-AI images that more closely resemble natural images tend to receive fewer likes.
Themes of AI-generated images have shifted towards more artistic content.
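The inverse relationship between similarity and likes can be illustrated with a rank correlation. The following is a minimal sketch only: this summary does not say which statistic or similarity measure the paper uses, so Spearman correlation is an assumption. The names `similarity_to_natural` and `likes`, and all the toy values, are hypothetical and not taken from TWIGMA.

```python
from typing import Sequence


def ranks(values: Sequence[float]) -> list[float]:
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation computed on ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical toy data (NOT from the dataset): one entry per image.
similarity_to_natural = [0.91, 0.35, 0.78, 0.52, 0.66]
likes = [3, 120, 10, 45, 45]

rho = spearman(similarity_to_natural, likes)
print(f"Spearman rho = {rho:.3f}")  # a negative value means an inverse relationship
```

A rank-based statistic is used here because like counts on social media are typically heavy-tailed, and ranks keep a few viral posts from dominating the result.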
Abstract
Recent progress in generative artificial intelligence (gen-AI) has enabled the generation of photo-realistic and artistically-inspiring photos at a single click, catering to millions of users online. To explore how people use gen-AI models such as DALLE and StableDiffusion, it is critical to understand the themes, contents, and variations present in the AI-generated photos. In this work, we introduce TWIGMA (TWItter Generative-ai images with MetadatA), a comprehensive dataset encompassing over 800,000 gen-AI images collected from Jan 2021 to March 2023 on Twitter, with associated metadata (e.g., tweet text, creation date, number of likes), available at https://zenodo.org/records/8031785. Through a comparative analysis of TWIGMA with natural images and human artwork, we find that gen-AI images possess distinctive characteristics and exhibit, on average, lower variability when compared to…
Taxonomy
Topics: Aesthetic Perception and Analysis · Generative Adversarial Networks and Image Synthesis
