Volumes of logistic regression models with applications to model selection
James G. Dowty

TL;DR
This paper studies the geometry of logistic regression models. It proves finite lower and upper bounds on their Fisher information volumes, which can therefore serve as a measure of model complexity, and examines the consequences for model selection and sparsity.
Contribution
The paper proves bounds on the Fisher information volumes of logistic regression models using a generalization of the classical theorems of Pythagoras and de Gua, and shows how discontinuities in the volume lead the resulting model-selection criterion to prefer sparse models.
Findings
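For n observations and q linearly-independent covariates, the Fisher information volume is bounded below by π^q and above by (n choose q)π^q.
The volume is a continuous function of the design matrix at generic points, but it is discontinuous in general.
Models with sparse design matrices can be significantly less complex than nearby models, so the volume-based model-selection criterion prefers sparse models.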
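As a sanity check on the lower bound, the simplest case n = q = 1 can be worked by hand. The derivation below is a sketch that assumes the Fisher information volume is the integral of the square root of the determinant of the Fisher information over the parameter space (the usual Riemannian volume; this summary does not spell out the definition) and takes the 1×1 design matrix to be (1).

```latex
% Assumed setting: n = q = 1, design matrix X = (1), parameter \beta \in \mathbb{R}.
\begin{align*}
p(\beta) &= \frac{1}{1 + e^{-\beta}}, \\
I(\beta) &= p(\beta)\bigl(1 - p(\beta)\bigr)
          = \frac{e^{\beta}}{(1 + e^{\beta})^{2}}
          = \frac{1}{4\cosh^{2}(\beta/2)}, \\
\mathrm{Vol} &= \int_{-\infty}^{\infty} \sqrt{I(\beta)}\, d\beta
          = \int_{-\infty}^{\infty} \frac{d\beta}{2\cosh(\beta/2)}
          = \int_{-\infty}^{\infty} \operatorname{sech}(u)\, du
          = \Bigl[\, 2\arctan(e^{u}) \Bigr]_{-\infty}^{\infty}
          = \pi .
\end{align*}
```

Here the lower bound π^q and the upper bound (n choose q)π^q coincide at π, so under the stated assumption both bounds are attained in this case.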
Abstract
Logistic regression models with n observations and q linearly-independent covariates are shown to have Fisher information volumes which are bounded below by π^q and above by (n choose q)π^q. This is proved with a novel generalization of the classical theorems of Pythagoras and de Gua, which is of independent interest. The finding that the volume is always finite is new, and it implies that the volume can be directly interpreted as a measure of model complexity. The volume is shown to be a continuous function of the design matrix X at generic X, but to be discontinuous in general. This means that models with sparse design matrices can be significantly less complex than nearby models, so the resulting model-selection criterion prefers sparse models. This is analogous to the way that ℓ1-regularisation tends to prefer sparse model fits, though in our case this…
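The bounds and the discontinuity described in the abstract can be illustrated numerically for a single covariate (q = 1). The following is a minimal sketch, not the paper's method: it assumes the Fisher information volume is the integral of sqrt(det(XᵀWX)) over the parameter space, where W is the diagonal matrix of Bernoulli variances p_i(1 − p_i), and it approximates that integral with a plain trapezoid rule. The function names `bernoulli_variance` and `fisher_volume_1d`, the integration range, and the example design columns are all hypothetical choices made for illustration.

```python
import math


def bernoulli_variance(t):
    """Return sigma(t) * (1 - sigma(t)), written to avoid overflow for large |t|."""
    e = math.exp(-abs(t))
    return e / (1.0 + e) ** 2


def fisher_volume_1d(x, half_width=2000.0, num_steps=80_000):
    """Trapezoid estimate of the volume for a design matrix with one column x.

    Assumption (not stated in the summary): the volume is the integral over
    beta of sqrt(det(X^T W X)), which for q = 1 is the integral of
    sqrt(sum_i x_i^2 * p_i * (1 - p_i)) with p_i = sigma(x_i * beta).
    """
    step = 2.0 * half_width / num_steps
    total = 0.0
    for k in range(num_steps + 1):
        beta = -half_width + k * step
        # With q = 1 the Fisher information X^T W X is a scalar.
        info = sum(xi * xi * bernoulli_variance(xi * beta) for xi in x)
        # Endpoints get half weight in the trapezoid rule.
        weight = 0.5 if k in (0, num_steps) else 1.0
        total += weight * math.sqrt(info)
    return step * total


if __name__ == "__main__":
    # Hypothetical design columns with n = 2 observations and q = 1 covariate.
    for column in ([1.0, 1.0], [1.0, 0.05], [1.0, 0.0]):
        n = len(column)
        volume = fisher_volume_1d(column)
        print(f"x = {column}: volume ~ {volume:.4f} "
              f"(lower bound {math.pi:.4f}, upper bound {n * math.pi:.4f})")
```

Under these assumptions, the sparse column (1, 0) gives a volume of π, which is the lower bound, while the nearby column (1, 0.05) gives a volume well above π and below the upper bound 2π. A small perturbation of a sparse design therefore produces a large jump in volume, which is the kind of discontinuity the abstract describes.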
Taxonomy
Topics: Gene Regulatory Network Analysis · Statistical Methods and Inference · Advanced Statistical Methods and Models
Methods: Logistic Regression
