The Five-Dollar Model: Generating Game Maps and Sprites from Sentence Embeddings
Timothy Merino, Roman Negri, Dipika Rajesh, M Charity, Julian Togelius

TL;DR
The paper introduces the five-dollar model, a lightweight text-to-image generator that produces semantically meaningful low-dimensional images (pixel art game maps, sprites, and down-scaled emojis) from small datasets, and uses novel augmentation strategies to improve performance on that limited data.
Contribution
It presents a new low-dimensional, lightweight generative architecture capable of producing meaningful images from limited data and introduces augmentation strategies to enhance its performance.
Findings
Successfully generates accurate pixel art, sprites, and emojis from text.
Maintains semantic meaning despite small datasets and model size.
Evaluates text-image pairs by cosine similarity score, computed with the CLIP ViT-B/32 model.
Abstract
The five-dollar model is a lightweight text-to-image generative architecture that generates low-dimensional images from an encoded text prompt. This model can successfully generate accurate and aesthetically pleasing content in low-dimensional domains, with limited amounts of training data. Despite the small size of both the model and datasets, the generated images are still able to maintain the encoded semantic meaning of the textual prompt. We apply this model to three small datasets: pixel art video game maps, video game sprite images, and down-scaled emoji images, and apply novel augmentation strategies to improve the performance of our model on these limited datasets. We evaluate our model's performance using the cosine similarity score between text-image pairs generated by the CLIP ViT-B/32 model.
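The evaluation metric named in the abstract can be illustrated with a minimal sketch. It assumes the text and image embeddings have already been produced by an encoder such as CLIP ViT-B/32. The toy vectors and the helper names `cosine_similarity` and `mean_pair_similarity` are hypothetical and are not taken from the paper.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def mean_pair_similarity(text_embeddings, image_embeddings):
    """Average cosine similarity over aligned (text, image) embedding pairs."""
    if len(text_embeddings) != len(image_embeddings):
        raise ValueError("need exactly one image embedding per text embedding")
    if not text_embeddings:
        raise ValueError("need at least one pair")
    scores = [
        cosine_similarity(t, i)
        for t, i in zip(text_embeddings, image_embeddings)
    ]
    return sum(scores) / len(scores)


# Toy stand-ins for encoder outputs (assumption: real CLIP ViT-B/32
# embeddings are much higher-dimensional; these 3-d vectors are illustrative).
text_embeddings = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
image_embeddings = [[2.0, 0.0, 0.0], [0.0, 0.0, 3.0]]

score = mean_pair_similarity(text_embeddings, image_embeddings)
```

In this toy case the first pair points in the same direction and the second pair is orthogonal, so the average falls halfway between the two extremes. A higher average would indicate that generated images sit closer to their prompts in the shared embedding space.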
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Video Analysis and Summarization · Image Retrieval and Classification Techniques
Methods: Contrastive Language-Image Pre-training
