Skill Rating for Generative Models
Catherine Olsson, Surya Bhupatiraju, Tom Brown, Augustus Odena, Ian Goodfellow

TL;DR
This paper introduces an evaluation framework for generative models inspired by competitive game tournaments: generators and discriminators play matches against each other, and the outcomes are summarized as win rates and skill ratings to track training progress and compare different models.
Contribution
It proposes tournament-based evaluation methods, including win rate and skill rating, providing new tools for monitoring training and comparing generative models.
Findings
Tournament evaluations effectively track training progress.
Skill ratings offer a comparative measure of model capabilities.
Tournament methods complement existing evaluation approaches.
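As a rough illustration of the first finding, the sketch below computes a tournament win rate for successive checkpoints of one generator. It assumes a generator "wins" a match when a discriminator judges its sample to be real; the function name `generator_win_rates` and the outcome numbers are hypothetical and are not taken from the paper.

```python
def generator_win_rates(outcomes):
    """Average each generator's win fraction over all discriminators.

    outcomes[i][j] is the fraction of samples from generator checkpoint i
    that discriminator checkpoint j judged to be real (assumed definition
    of a generator "win").
    """
    return [sum(row) / len(row) for row in outcomes]


# Hypothetical tournament: rows are generator checkpoints (early to late),
# columns are discriminator checkpoints (early to late).
outcomes = [
    [0.50, 0.10, 0.00],
    [0.80, 0.50, 0.20],
    [0.90, 0.70, 0.50],
]

rates = generator_win_rates(outcomes)
for step, rate in enumerate(rates):
    print(f"generator checkpoint {step}: win rate {rate:.2f}")
```

In this made-up data the win rate rises with each checkpoint, which is the kind of trend one would read as training progress.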
Abstract
We explore a new way to evaluate generative models using insights from evaluation of competitive games between human players. We show experimentally that tournaments between generators and discriminators provide an effective way to evaluate generative models. We introduce two methods for summarizing tournament outcomes: tournament win rate and skill rating. Evaluations are useful in different contexts, including monitoring the progress of a single model as it learns during the training process, and comparing the capabilities of two different fully trained models. We show that a tournament consisting of a single model playing against past and future versions of itself produces a useful measure of training progress. A tournament containing multiple separate models (using different seeds, hyperparameters, and architectures) provides a useful relative comparison between different trained…
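To make the idea of a skill rating concrete, here is a minimal sketch that uses the classic Elo update as a stand-in. This summary does not say which rating system the authors use, so the choice of Elo, the constant `k=32`, and the starting rating of 1500 are all assumptions made for illustration.

```python
def expected_score(rating_a, rating_b):
    """Elo-predicted probability that player A beats player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_update(rating_a, rating_b, score_a, k=32.0):
    """Return updated (rating_a, rating_b) after one match.

    score_a is 1.0 if A won, 0.0 if A lost, 0.5 for a draw.
    k is an assumed step size, not a value from the paper.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


# Hypothetical match: a generator fools a discriminator once, so the
# generator is credited with the win (score_a = 1.0).
generator_rating, discriminator_rating = 1500.0, 1500.0
generator_rating, discriminator_rating = elo_update(
    generator_rating, discriminator_rating, score_a=1.0
)
print(generator_rating, discriminator_rating)
```

Unlike a raw win rate, a rating of this kind accounts for opponent strength: beating a highly rated discriminator moves a generator's rating more than beating a weak one.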
Taxonomy
Topics: Artificial Intelligence in Games · Sports Analytics and Performance · Generative Adversarial Networks and Image Synthesis
