Joint Level Generation and Translation Using Gameplay Videos
Negar Mirgati, Matthew Guzdial

TL;DR
This paper introduces a multi-tail framework that simultaneously translates gameplay videos into level representations and generates new level segments, addressing data scarcity in procedural content generation.
Contribution
It presents a novel multi-tail framework that jointly learns level translation and generation from gameplay videos, reducing the need for laborious secondary representations.
Findings
Joint framework improves both translation and generation performance.
Combining tasks enhances overall results compared to baselines.
The framework may generalize to unseen games, which is left to future work.
Abstract
Procedural Content Generation via Machine Learning (PCGML) faces a significant hurdle that sets it apart from other fields, such as image or text generation: limited annotated data. Many existing methods for procedural level generation via machine learning require a secondary representation in addition to level images. However, the current methods for obtaining such representations are laborious and time-consuming, which contributes to this problem. In this work, we aim to address this problem by using gameplay videos of two human-annotated games to develop a novel multi-tail framework that learns to perform level translation and generation simultaneously. The translation tail of our framework can convert gameplay video frames to an equivalent secondary representation, while its generation tail can produce novel level segments. Evaluation results and comparisons between our…
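The multi-tail idea described in the abstract can be illustrated with a minimal sketch: one shared encoder computes features from a frame, and two separate tails read those features, one to translate the frame into a tile grid and one to propose a level segment. This is not the authors' architecture. The tile vocabulary, layer sizes, random untrained weights, and every name below (`MultiTailModel`, `to_tiles`, and so on) are hypothetical and chosen only to keep the example self-contained.

```python
import random

# All constants below are illustrative assumptions, not values from the paper.
TILE_TYPES = ["empty", "ground", "enemy"]  # hypothetical tile vocabulary
GRID_CELLS = 4    # cells in the toy secondary representation
FRAME_SIZE = 16   # flattened pixel count of a toy frame
HIDDEN = 8        # width of the shared feature vector


def make_layer(n_in, n_out, rng):
    """Create a dense layer with random weights and zero biases."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(layer, x):
    """Apply a dense layer: y = Wx + b."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def relu(x):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in x]


def to_tiles(logits):
    """Pick the highest-scoring tile type for each grid cell."""
    n = len(TILE_TYPES)
    tiles = []
    for cell in range(GRID_CELLS):
        scores = logits[cell * n:(cell + 1) * n]
        tiles.append(TILE_TYPES[scores.index(max(scores))])
    return tiles


class MultiTailModel:
    """Shared encoder feeding two task-specific tails (untrained toy model)."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        out = GRID_CELLS * len(TILE_TYPES)
        self.encoder = make_layer(FRAME_SIZE, HIDDEN, rng)  # shared trunk
        self.translation_tail = make_layer(HIDDEN, out, rng)
        self.generation_tail = make_layer(HIDDEN, out, rng)

    def forward(self, frame):
        # Features are computed once and reused by both tails.
        features = relu(dense(self.encoder, frame))
        translated = to_tiles(dense(self.translation_tail, features))
        generated = to_tiles(dense(self.generation_tail, features))
        return translated, generated


# A deterministic toy "frame" standing in for flattened video pixels.
frame = [((i * 7) % 10) / 10 for i in range(FRAME_SIZE)]
model = MultiTailModel(seed=0)
translated, generated = model.forward(frame)
print("translation tail:", translated)
print("generation tail: ", generated)
```

Because the weights are random, the outputs are meaningless; the sketch only shows the data flow. In multi-task setups of this kind, the per-tail losses are commonly combined during training so that both tasks update the shared encoder, which is one plausible reason a joint model could help each task, though the paper's actual training procedure is not given in the text above.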
Taxonomy
Topics: Video Analysis and Summarization · Multimodal Machine Learning Applications · Human Motion and Animation
