GameLabel-10K: Collecting Image Preference Data Through Mobile Game Crowdsourcing
Jonathan Zhou

TL;DR
This paper presents GameLabel-10K, a novel dataset of image preferences collected via a mobile game, demonstrating that video game players can effectively generate high-quality data for model fine-tuning.
Contribution
It introduces a data collection method in which mobile game players supply image preference labels in exchange for in-game currency, yielding a dataset of nearly 10,000 labels and a fine-tuned Flux Schnell model with improved prompt adherence.
Findings
Successful collection of nearly 10,000 labels from game players
Enhanced prompt adherence in fine-tuned models
Public release of dataset and model for community use
Abstract
The rise of multi-billion-parameter models has sparked an intense hunger for data across deep learning. This study explores the possibility of replacing paid annotators with video game players who are rewarded with in-game currency for good performance. We collaborate with the developers of a mobile historical strategy game, Armchair Commander, to test this idea on pairwise image preference data, which is typically used to fine-tune diffusion models. Using this method, we create GameLabel-10K, a dataset with slightly under 10 thousand labels and 7,000 unique prompts. We fine-tune Flux Schnell on this dataset and find an improvement in its prompt adherence, demonstrating the validity of our collection method. In addition, we publicly release both the dataset and our fine-tuned model on Hugging Face.
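To make "pairwise image preference data" concrete, here is a minimal sketch of what one label could look like and how repeated labels might be collapsed into training pairs. The field names, the `PreferenceLabel` class, the `to_training_pairs` helper, and the majority-vote rule are all hypothetical illustrations; they are not the published GameLabel-10K schema or the paper's aggregation procedure.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class PreferenceLabel:
    """One player judgment: which of two images better matches the prompt.

    All field names are assumptions made for illustration.
    """
    prompt: str
    image_a: str    # hypothetical identifier for the first image
    image_b: str    # hypothetical identifier for the second image
    preferred: str  # "a" or "b"


def to_training_pairs(labels):
    """Collapse repeated labels for the same comparison by majority vote.

    Returns (prompt, chosen_image, rejected_image) tuples; ties are dropped.
    Majority voting is an assumed rule, not one taken from the paper.
    """
    votes = {}
    for label in labels:
        key = (label.prompt, label.image_a, label.image_b)
        votes.setdefault(key, Counter())[label.preferred] += 1

    pairs = []
    for (prompt, image_a, image_b), counts in votes.items():
        if counts["a"] == counts["b"]:
            continue  # no clear preference for this comparison
        if counts["a"] > counts["b"]:
            pairs.append((prompt, image_a, image_b))
        else:
            pairs.append((prompt, image_b, image_a))
    return pairs
```

Tuples of this chosen/rejected shape are the usual input to preference-based fine-tuning of image generators, which is why the sketch ends there rather than at raw votes.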
Taxonomy
Topics: Recommender Systems and Techniques · Educational Games and Gamification · Data Management and Algorithms
Methods: Diffusion
