Posterior Dispersion Indices

Alp Kucukelbir; David M. Blei

arXiv:1605.07604 · stat.ML · May 25, 2016

Open Access

TL;DR

This paper introduces posterior dispersion indices (PDI), a family of evaluation metrics for probabilistic models that assess how each data point fares relative to posterior uncertainty about the hidden structure, revealing model mismatch that predictive accuracy alone does not show.

Contribution

It proposes a family of posterior dispersion indices (PDI) that evaluate model fit by examining each data point in relation to posterior uncertainty around the hidden structure, complementing predictive accuracy.

Findings

01

PDIs reveal patterns of model mismatch in real data

02

PDIs surface aspects of model mismatch that predictive accuracy alone does not capture

03

Demonstrated on three real datasets: voting preferences, supermarket shopping, and population genetics

Abstract

Probabilistic modeling is cyclical: we specify a model, infer its posterior, and evaluate its performance. Evaluation drives the cycle, as we revise our model based on how it performs. This requires a metric. Traditionally, predictive accuracy prevails. Yet, predictive accuracy does not tell the whole story. We propose to evaluate a model through posterior dispersion. The idea is to analyze how each datapoint fares in relation to posterior uncertainty around the hidden structure. We propose a family of posterior dispersion indices (PDI) that capture this idea. A PDI identifies rich patterns of model mismatch in three real data examples: voting preferences, supermarket shopping, and population genetics.
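To make the idea of scoring each datapoint against posterior uncertainty concrete, here is a minimal sketch in Python. It is an illustration only, under stated assumptions: the model is a Normal likelihood with known standard deviation and a conjugate Normal prior on the mean, and the per-datapoint score is the variance of log p(x_n | theta) across posterior samples. That score is a simple stand-in chosen for illustration and is not necessarily the index defined in the paper; the function names are hypothetical.

```python
import numpy as np

def posterior_samples_normal_mean(x, sigma=1.0, prior_mu=0.0, prior_sigma=10.0,
                                  n_samples=4000, seed=0):
    """Sample the conjugate posterior of a Normal mean (known sigma). Assumed toy model."""
    rng = np.random.default_rng(seed)
    post_prec = 1.0 / prior_sigma**2 + x.size / sigma**2
    post_var = 1.0 / post_prec
    post_mu = post_var * (prior_mu / prior_sigma**2 + x.sum() / sigma**2)
    return rng.normal(post_mu, np.sqrt(post_var), size=n_samples)

def loglik_dispersion(x, theta, sigma=1.0):
    """Illustrative score: variance over posterior samples of log p(x_n | theta)."""
    # ll has shape (n_samples, n_data): one row per posterior draw of theta.
    resid = (x[None, :] - theta[:, None]) / sigma
    ll = -0.5 * np.log(2.0 * np.pi * sigma**2) - 0.5 * resid**2
    return ll.var(axis=0)

# Synthetic data: 50 well-behaved points plus one point the model explains poorly.
rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(0.0, 1.0, size=50), [8.0]])

theta = posterior_samples_normal_mean(x)
scores = loglik_dispersion(x, theta)

# Index of the datapoint whose log-likelihood is most sensitive to posterior uncertainty.
most_dispersed = int(np.argmax(scores))
print(most_dispersed, scores[most_dispersed])
```

In this toy setting the score grows with a point's distance from the posterior mean, so the appended outlier stands out. Ranking datapoints by such a score, rather than reporting one aggregate accuracy number, is the kind of per-datapoint view the abstract describes.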

Taxonomy

Topics: Bayesian Methods and Mixture Models · Gaussian Processes and Bayesian Inference · Statistical Methods and Bayesian Inference