GameLabel-10K: Collecting Image Preference Data Through Mobile Game Crowdsourcing

Jonathan Zhou

arXiv:2409.19830·cs.CV·October 25, 2024

Open Access · 1 Model · 1 Dataset

TL;DR

This paper presents GameLabel-10K, a novel dataset of image preferences collected via a mobile game, demonstrating that video game players can effectively generate high-quality data for model fine-tuning.

Contribution

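It introduces a data collection method in which mobile game players, rewarded with in-game currency, provide pairwise image preference labels. The method yields a dataset of nearly 10,000 labels, and fine-tuning Flux Schnell on it improves the model's prompt adherence.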

Findings

01. Successful collection of nearly 10,000 labels from game players

02. Enhanced prompt adherence in the fine-tuned Flux Schnell model

03. Public release of dataset and model for community use

Abstract

The rise of multi-billion parameter models has sparked an intense hunger for data across deep learning. This study explores the possibility of replacing paid annotators with video game players who are rewarded with in-game currency for good performance. We collaborate with the developers of a mobile historical strategy game, Armchair Commander, to test this idea. Specifically, we test it on pairwise image preference data, which is typically used to fine-tune diffusion models. Using this method, we create GameLabel-10K, a dataset with slightly under 10,000 labels and 7,000 unique prompts. We fine-tune Flux Schnell on this dataset and find an improvement in its prompt adherence, demonstrating the validity of our collection method. In addition, we publicly release both the dataset and our fine-tuned model on Hugging Face.
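
To make "pairwise image preference data" concrete, the following is a minimal sketch of how such labels might be represented and merged into (prompt, chosen, rejected) triples, the form usually consumed by preference-based fine-tuning. The abstract does not describe the dataset's schema or any aggregation step, so the `PreferenceLabel` fields, the file names, and the majority-vote rule are all hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PreferenceLabel:
    """One player's answer to a single comparison (hypothetical schema)."""
    prompt: str
    image_a: str
    image_b: str
    choice: str  # "a" or "b": which image the player preferred


def aggregate_preferences(labels):
    """Majority-vote repeated comparisons into (prompt, chosen, rejected) triples.

    Majority voting is an illustrative assumption, not the paper's method.
    Comparisons with a tied vote are dropped.
    """
    votes = defaultdict(Counter)
    for label in labels:
        if label.choice not in ("a", "b"):
            raise ValueError(f"unexpected choice: {label.choice!r}")
        votes[(label.prompt, label.image_a, label.image_b)][label.choice] += 1

    pairs = []
    for (prompt, image_a, image_b), counts in votes.items():
        if counts["a"] == counts["b"]:
            continue  # no clear preference, skip this comparison
        if counts["a"] > counts["b"]:
            pairs.append((prompt, image_a, image_b))
        else:
            pairs.append((prompt, image_b, image_a))
    return pairs


# Made-up example labels; none of these values come from GameLabel-10K.
labels = [
    PreferenceLabel("a castle at dawn", "img_001.png", "img_002.png", "a"),
    PreferenceLabel("a castle at dawn", "img_001.png", "img_002.png", "a"),
    PreferenceLabel("a castle at dawn", "img_001.png", "img_002.png", "b"),
    PreferenceLabel("a red fox", "img_003.png", "img_004.png", "b"),
    PreferenceLabel("a tied pair", "img_005.png", "img_006.png", "a"),
    PreferenceLabel("a tied pair", "img_005.png", "img_006.png", "b"),
]
pairs = aggregate_preferences(labels)
for prompt, chosen, rejected in pairs:
    print(f"{prompt}: chosen={chosen}, rejected={rejected}")
```

Each resulting triple names the preferred and the rejected image for one prompt. The released dataset may store its labels differently, so its dataset card should be checked before adapting this sketch.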

Code & Models

Models

Jonathan-Zhou/Flux-GameLabel-Lora
model

Datasets

Jonathan-Zhou/GameLabel-10k
dataset · 15 downloads

Taxonomy

Topics: Recommender Systems and Techniques · Educational Games and Gamification · Data Management and Algorithms

Methods: Diffusion