Information Criterion for a Large Scale Subset Regression Models

Genshiro Kitagawa

arXiv:2309.08110·stat.ME·September 18, 2023

TL;DR

This paper presents a bias-corrected information criterion for large-scale subset regression models. The correction targets the risk that, when variables are chosen from a very large pool of candidates, apparent correlations lead to inappropriate variables being selected.

Contribution

It introduces a bias correction method for the information criterion tailored to large-scale subset regression, improving variable selection accuracy.

Findings

01 Proposes a new bias correction for the information criterion.

02 Addresses problems in variable selection with many candidate variables.

03 Enhances model selection reliability in large-scale data analysis.

Abstract

The information criterion for determining the number of explanatory variables in subset regression modeling is discussed. Information criteria such as AIC are effective and frequently used for model selection in ordinary regression models and other statistical models. With the recent growth of data science, the analysis of large-scale data has become important. When models are constructed heuristically from a very large number of candidate explanatory variables, there is a risk of picking up apparent correlations and adopting inappropriate variables. In this paper, we point out the problems specific to subset regression from the viewpoint of bias correction for the log-likelihood and present a correction method that takes them into account.
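
The selection problem described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's correction method; it only shows the symptom under assumed settings. The response is pure noise, every candidate variable is independent noise, and the best single-variable model is compared with the intercept-only model using the standard Gaussian AIC. All function names, sample sizes, and the single-variable search are assumptions made for illustration.

```python
import numpy as np


def gaussian_aic(rss, n, n_coef):
    """AIC of a Gaussian linear model fitted by maximum likelihood.

    Counts n_coef regression coefficients plus one variance parameter.
    """
    return n * np.log(2.0 * np.pi * rss / n) + n + 2 * (n_coef + 1)


def best_single_variable_aic(y, X):
    """Smallest AIC among the models y ~ intercept + x_j over all columns j."""
    n = len(y)
    yc = y - y.mean()
    Xc = X - X.mean(axis=0)
    # Residual sum of squares of each simple regression: Syy - Sxy^2 / Sxx
    sxy = Xc.T @ yc
    sxx = (Xc ** 2).sum(axis=0)
    rss = yc @ yc - sxy ** 2 / sxx
    # Two coefficients: intercept and slope
    return gaussian_aic(rss.min(), n, 2)


def spurious_selection_rate(n=50, n_candidates=200, n_trials=200, seed=0):
    """Fraction of trials in which AIC prefers a noise variable to no variable."""
    rng = np.random.default_rng(seed)
    wins = 0
    for _ in range(n_trials):
        y = rng.standard_normal(n)                  # response unrelated to X
        X = rng.standard_normal((n, n_candidates))  # pure-noise candidates
        null_rss = ((y - y.mean()) ** 2).sum()
        null_aic = gaussian_aic(null_rss, n, 1)     # intercept-only model
        if best_single_variable_aic(y, X) < null_aic:
            wins += 1
    return wins / n_trials


if __name__ == "__main__":
    for m in (1, 10, 200):
        print(m, spurious_selection_rate(n_candidates=m))
```

Under these assumptions, the rate at which an irrelevant variable is adopted grows with the number of candidates, because the minimum residual sum of squares over many candidates is optimistically small and the usual AIC penalty does not account for the search. This is the kind of bias in the log-likelihood that a correction for subset regression would need to address.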

Taxonomy

Topics: Neural Networks and Applications · Face and Expression Recognition · Advanced Statistical Methods and Models