Large-Scale Model Selection with Misspecification

Emre Demirkaya; Yang Feng; Pallavi Basu; Jinchi Lv

arXiv:1803.07418·stat.ME·March 21, 2018

Open Access

TL;DR

This paper develops a new high-dimensional generalized Bayesian information criterion, HGBIC_p, for model selection when the candidate models may be misspecified and the number of covariates is large. The criterion is shown to be consistent, and its prior probabilities are chosen to encourage interpretable models.

Contribution

It introduces HGBIC_p, a novel information criterion that accounts for model misspecification and high dimensionality, with proven consistency in ultra-high-dimensional settings.

Findings

01. HGBIC_p effectively balances model fit and complexity.

02. The method demonstrates consistency in ultra-high-dimensional scenarios.

03. Numerical studies confirm the advantages of the proposed criterion.
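
As a point of reference for the fit-versus-complexity trade-off named in the findings, the sketch below scores nested linear models with the classical BIC, n·log(RSS/n) + k·log(n). This is not HGBIC_p: the page does not state that criterion's formula, so none is reproduced here. The simulated data, the helper name `gaussian_bic`, and the nested candidate set are all assumptions made for illustration.

```python
import numpy as np

def gaussian_bic(X, y):
    """Classical BIC (up to an additive constant) for a Gaussian
    linear model fit by least squares, without an intercept."""
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    fit_term = n * np.log(rss / n)   # decreases as covariates are added
    complexity_term = k * np.log(n)  # increases with model size
    return fit_term + complexity_term

# Hypothetical simulated data: only the first two covariates matter.
rng = np.random.default_rng(0)
n, p = 200, 10
X = rng.standard_normal((n, p))
y = 2.0 * X[:, 0] - 1.5 * X[:, 1] + rng.standard_normal(n)

# Candidate models: the first k columns of X, for k = 1..p.
scores = {k: gaussian_bic(X[:, :k], y) for k in range(1, p + 1)}
best_k = min(scores, key=scores.get)
print("selected model size:", best_k)
```

The one-covariate model omits a strong signal, so its fit term is far worse than any larger model's. Beyond the true covariates, each extra column lowers the fit term only slightly while adding log(n) to the penalty. Per the summary above, HGBIC_p adapts this kind of trade-off to settings where the model may be misspecified and the number of covariates is very large.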

Abstract

Model selection is crucial to high-dimensional learning and inference for contemporary big data applications in pinpointing the best set of covariates among a sequence of candidate interpretable models. Most existing work assumes implicitly that the models are correctly specified or have fixed dimensionality. Yet both features of model misspecification and high dimensionality are prevalent in practice. In this paper, we exploit the framework of model selection principles in misspecified models originated in Lv and Liu (2014) and investigate the asymptotic expansion of Bayesian principle of model selection in the setting of high-dimensional misspecified models. With a natural choice of prior probabilities that encourages interpretability and incorporates Kullback-Leibler divergence, we suggest the high-dimensional generalized Bayesian information criterion with prior probability…

Taxonomy

Topics: Statistical Methods and Inference · Bayesian Methods and Mixture Models · Advanced Statistical Methods and Models

Methods: Interpretability