A Note on the Inception Score

Shane Barratt; Rishi Sharma

arXiv:1801.01973 · stat.ML · June 22, 2018 · 234 cites

TL;DR

This paper critically examines the Inception Score, revealing its limitations in evaluating generative models and emphasizing the need for more reliable assessment methods in the field.

Contribution

It provides a detailed analysis of the Inception Score's shortcomings and advocates for more systematic evaluation practices in generative model research.

Findings

01. Inception Score can be misleading when comparing models.

02. The metric has inherent suboptimalities affecting its reliability.

03. Careful evaluation is crucial for meaningful progress in generative modeling.

Abstract

Deep generative models are powerful tools that have produced impressive results in recent years. These advances have been for the most part empirically driven, making it essential that we use high quality evaluation metrics. In this paper, we provide new insights into the Inception Score, a recently proposed and widely used evaluation metric for generative models, and demonstrate that it fails to provide useful guidance when comparing models. We discuss both suboptimalities of the metric itself and issues with its application. Finally, we call for researchers to be more systematic and careful when evaluating and comparing generative models, as the advancement of the field depends upon it.
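
For readers unfamiliar with the metric under discussion, the following is a minimal sketch of how the Inception Score is commonly computed. The formula is not stated on this page; the sketch assumes the standard definition, the exponential of the average KL divergence between each image's predicted class distribution p(y|x) and the marginal p(y). The function name `inception_score`, the `probs` argument, and the `eps` smoothing constant are illustrative choices, and the sketch assumes the class probabilities have already been obtained from a pretrained classifier (in practice, an Inception network).

```python
import numpy as np


def inception_score(probs, eps=1e-12):
    """Sketch of the Inception Score from precomputed class probabilities.

    probs: array of shape (num_images, num_classes); each row is an assumed
           softmax output p(y|x) for one generated image.
    eps:   small constant (an assumption of this sketch) to avoid log(0).
    """
    probs = np.asarray(probs, dtype=np.float64)
    # Marginal class distribution p(y), averaged over all generated images.
    marginal = probs.mean(axis=0, keepdims=True)
    # KL(p(y|x) || p(y)) for each image.
    kl = np.sum(probs * (np.log(probs + eps) - np.log(marginal + eps)), axis=1)
    # Exponentiate the mean KL divergence.
    return float(np.exp(kl.mean()))
```

Under this definition, the score approaches 1 when every image receives the same predicted distribution, and approaches the number of classes when each prediction is fully confident and the predictions are spread evenly across classes. The result depends entirely on the classifier that produces `probs`.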

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Data Visualization and Analytics · Anomaly Detection Techniques and Applications