Playing for Data: Ground Truth from Computer Games
Stephan R. Richter, Vibhav Vineet, Stefan Roth, Vladlen Koltun

TL;DR
This paper introduces a method for generating dense pixel-level semantic labels for images extracted from commercial computer games. The method greatly reduces the manual annotation effort, and the resulting data improves semantic segmentation accuracy when used to supplement real-world training images.
Contribution
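The authors reconstruct associations between image patches from the communication between the game and the graphics hardware. These associations let semantic labels propagate within and across images, so large annotated datasets can be created quickly without access to the game's source code or content.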
Findings
Produced dense pixel-level semantic labels for 25,000 images synthesized by a photorealistic game.
Supplementing real-world datasets with the game data improves segmentation accuracy.
Models trained on game data plus one-third of the CamVid training set outperform models trained on the full CamVid training set.
Abstract
Recent progress in computer vision has been driven by high-capacity models trained on large datasets. Unfortunately, creating large datasets with pixel-level labels has been extremely costly due to the amount of human effort required. In this paper, we present an approach to rapidly creating pixel-accurate semantic label maps for images extracted from modern computer games. Although the source code and the internal operation of commercial games are inaccessible, we show that associations between image patches can be reconstructed from the communication between the game and the graphics hardware. This enables rapid propagation of semantic labels within and across images synthesized by the game, with no access to the source code or the content. We validate the presented approach by producing dense pixel-level semantic annotations for 25 thousand images synthesized by a photorealistic…
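The propagation idea in the abstract can be illustrated with a small sketch. It assumes that each pixel has already been tagged with a signature identifying the rendering resources that produced it; the `(mesh, texture, shader)` tuple, the function names, and the toy frames below are all hypothetical stand-ins, not the authors' implementation. Once an annotator labels one signature, every pixel sharing that signature in any frame receives the same label.

```python
from typing import Dict, List, Tuple

# Hypothetical patch signature: identifiers of the resources used to draw a pixel.
Signature = Tuple[str, str, str]  # (mesh, texture, shader), assumed for illustration
Frame = List[List[Signature]]     # per-pixel signatures for one image
UNLABELED = "unlabeled"


def propagate_labels(
    frames: List[Frame], annotations: Dict[Signature, str]
) -> List[List[List[str]]]:
    """Assign each pixel the label annotated for its signature, if any."""
    return [
        [[annotations.get(sig, UNLABELED) for sig in row] for row in frame]
        for frame in frames
    ]


def coverage(label_maps: List[List[List[str]]]) -> float:
    """Fraction of pixels that received a label."""
    total = sum(len(row) for frame in label_maps for row in frame)
    labeled = sum(
        1 for frame in label_maps for row in frame for lab in row if lab != UNLABELED
    )
    return labeled / total if total else 0.0


# Toy data: two 2x2 frames sharing resources (all names are made up).
road = ("mesh_road", "tex_asphalt", "shader_std")
sky = ("mesh_dome", "tex_sky", "shader_sky")
car = ("mesh_car", "tex_paint", "shader_std")

frame_a: Frame = [[sky, sky], [road, car]]
frame_b: Frame = [[sky, road], [road, road]]

# The annotator labels only two signatures; the car stays unlabeled.
annotations = {road: "road", sky: "sky"}

label_maps = propagate_labels([frame_a, frame_b], annotations)
labeled_fraction = coverage(label_maps)
```

In this toy setup, two annotation actions label seven of the eight pixels across both frames, which is the kind of leverage the abstract describes as propagation "within and across images".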
Taxonomy
Topics: Advanced Neural Network Applications · Human Pose and Action Recognition · Generative Adversarial Networks and Image Synthesis
