Central limit theorem related to MDR-method
Alexander Bulinski

TL;DR
This paper extends the MDR-method for high-dimensional biological data with a binary response: it introduces regularized versions of the cross-validated prediction error estimates and proves a multidimensional CLT for them, which supports statistical inference about which factors are significant.
Contribution
It introduces regularized versions of the prediction error estimates used in the MDR-method (K-fold cross-validation with an arbitrary penalty function) and establishes a multidimensional CLT for them.
Findings
Proved multidimensional CLT for regularized estimates
Discussed self-normalization variants of the CLT
Built on necessary and sufficient conditions for strong consistency of the estimates, established in the author's earlier paper
Abstract
In many medical and biological investigations, including genetics, one typically handles high-dimensional data that can be viewed as a set of values of some factors together with a binary response variable. For instance, the response variable can describe the state of a patient's health, and one often assumes that it depends only on some of the factors. An important problem is to determine collections of significant factors. In this regard we turn to the MDR-method introduced by M. Ritchie and coauthors. Our recent paper provided necessary and sufficient conditions for strong consistency of estimates of the prediction error employing K-fold cross-validation and an arbitrary penalty function. Here we introduce regularized versions of these estimates and prove a multidimensional CLT for them. Statistical variants of the CLT involving self-normalization are discussed as well.…
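To make the objects in the abstract concrete, here is a minimal Python sketch of a K-fold cross-validated prediction error with a penalty function, for a binary response taking values in {-1, 1}. It is an illustration under stated assumptions, not the paper's estimator. The names `mdr_rule` and `cv_penalized_error` are hypothetical. The penalty is fixed to one common choice, 1 / P(Y = y), estimated on each test fold, so the result is the sum of the two class-wise error rates up to a constant factor. The cell-labelling rule is a simple MDR-style majority comparison. The regularization introduced in the paper is not reproduced here.

```python
from collections import defaultdict


def mdr_rule(X_train, y_train, factors):
    """Fit an MDR-style rule on the chosen factor indices (hypothetical helper).

    A cell (combination of factor values) is labelled 1 when the share of
    y == 1 inside the cell exceeds the share of y == 1 in the training
    sample, and -1 otherwise. Unseen cells default to -1 (an assumption).
    """
    base_rate = sum(1 for y in y_train if y == 1) / len(y_train)
    counts = defaultdict(lambda: [0, 0])  # cell -> [n_total, n_positive]
    for x, y in zip(X_train, y_train):
        cell = tuple(x[i] for i in factors)
        counts[cell][0] += 1
        counts[cell][1] += 1 if y == 1 else 0
    labels = {
        cell: (1 if pos / tot > base_rate else -1)
        for cell, (tot, pos) in counts.items()
    }
    return lambda x: labels.get(tuple(x[i] for i in factors), -1)


def cv_penalized_error(X, y, factors, K=5):
    """K-fold CV estimate of the penalized prediction error (sketch).

    Assumed penalty: 1 / P(Y = label), estimated on each test fold, so each
    fold contributes the sum of its class-wise misclassification rates.
    """
    n = len(y)
    folds = [list(range(k, n, K)) for k in range(K)]  # deterministic folds
    total = 0.0
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in range(n) if i not in held_out]
        rule = mdr_rule(
            [X[i] for i in train_idx], [y[i] for i in train_idx], factors
        )
        fold_err = 0.0
        for label in (-1, 1):
            n_label = sum(1 for i in test_idx if y[i] == label)
            if n_label == 0:
                continue  # class absent from this fold: no contribution
            wrong = sum(
                1 for i in test_idx if y[i] == label and rule(X[i]) != label
            )
            fold_err += wrong / n_label
        total += fold_err
    return total / K


# Toy data: the response depends on factor 0 only; factor 1 is irrelevant.
X = [(i % 2, (i // 2) % 2) for i in range(40)]
y = [1 if x[0] == 1 else -1 for x in X]

err_signal = cv_penalized_error(X, y, factors=[0])
err_noise = cv_penalized_error(X, y, factors=[1])
print(err_signal, err_noise)
```

On this toy data the collection containing the truly significant factor attains a smaller estimated error than the collection containing only the irrelevant one, which is the comparison that selecting significant factors relies on.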
Taxonomy
Topics: Gene expression and cancer classification · Neural Networks and Applications
