A Randomized Permutation Whole-Model Test Heuristic for Self-Validated Ensemble Models (SVEM)
Andrew T. Karl

TL;DR
This paper presents a heuristic whole-model significance test for Self-Validated Ensemble Models (SVEM). It compares the model's standardized predictions with those obtained after randomly permuting the response, with the aim of keeping the Type I error rate near its nominal level.
Contribution
It introduces a permutation-based whole-model significance test for SVEM, built on a Mahalanobis distance, together with a simulation-driven power analysis of the test.
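As a rough illustration of how a Mahalanobis distance can be combined with a permutation reference, the sketch below reduces a matrix of permutation results with an SVD, measures how far an observed vector lies from the permutation cloud, and reports an empirical p-value. The function name `mahalanobis_permutation_pvalue`, the 90% variance cutoff used to pick the rank, and the empirical (rank-based) p-value are assumptions made for this example; the paper's own calibration of the distances may differ.

```python
import numpy as np

def mahalanobis_permutation_pvalue(z_obs, Z_perm, var_explained=0.9):
    """Hypothetical helper: distance of z_obs from the permutation cloud Z_perm.

    z_obs  : (n_point,) vector from the fit to the real response
    Z_perm : (n_perm, n_point) matrix from fits to permuted responses
    """
    n_perm = Z_perm.shape[0]
    # Center and scale each column using the permutation results only.
    mu = Z_perm.mean(axis=0)
    sd = Z_perm.std(axis=0, ddof=1)
    Zc = (Z_perm - mu) / sd

    # Reduced-rank SVD: keep the leading components that explain the
    # requested share of variance (the 0.9 default is an assumption).
    _, s, Vt = np.linalg.svd(Zc, full_matrices=False)
    cum = np.cumsum(s**2) / np.sum(s**2)
    k = min(int(np.searchsorted(cum, var_explained)) + 1, len(s))
    comp_sd = s[:k] / np.sqrt(n_perm - 1)  # std. dev. of each retained score

    def distance(z):
        scores = ((z - mu) / sd) @ Vt[:k].T
        return np.sqrt(np.sum((scores / comp_sd) ** 2, axis=-1))

    d_obs = distance(z_obs)
    d_perm = distance(Z_perm)
    # Empirical permutation p-value with the usual +1 correction.
    p_value = (1 + np.sum(d_perm >= d_obs)) / (n_perm + 1)
    return float(d_obs), float(p_value)
```

With `n_perm` permutations the smallest attainable empirical p-value is `1 / (n_perm + 1)`, so the number of permutations bounds the resolution of this version of the test.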
Findings
The test maintains the nominal Type I error rate, even with complex models.
Simulation studies demonstrate the test's effectiveness in detecting true model significance.
The approach provides a joint graphical summary for multiple responses.
Abstract
We introduce a heuristic to test the significance of fit of Self-Validated Ensemble Models (SVEM) against the null hypothesis of a constant response. A SVEM model averages predictions from nBoot fits of a model, applied to fractionally weighted bootstraps of the target dataset. It tunes each fit on a validation copy of the training data, utilizing anti-correlated weights for training and validation. The proposed test computes SVEM predictions centered by the response column mean and normalized by the ensemble variability at each of nPoint points spaced throughout the factor space. A reference distribution is constructed by refitting the SVEM model to nPerm randomized permutations of the response column and recording the corresponding standardized predictions at the nPoint points. A reduced-rank singular value decomposition applied to the centered and scaled nPerm x nPoint reference…
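The first steps described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the paper's implementation: the base learner is a ridge regression with a small penalty grid (an assumption chosen to keep the example short), the anti-correlated weights are taken as -log(U) for training and -log(1 - U) for validation with U uniform on (0, 1), and the function names `svem_standardized_predictions` and `permutation_reference` are hypothetical.

```python
import numpy as np

def svem_standardized_predictions(X, y, X_points, n_boot=50,
                                  lambdas=(1e-3, 1e-2, 1e-1, 1.0, 10.0),
                                  rng=None):
    """Ensemble predictions at X_points, centered by mean(y) and scaled by
    the ensemble standard deviation (hypothetical helper)."""
    rng = np.random.default_rng(rng)
    n, p = X.shape
    A = np.column_stack([np.ones(n), X])
    A_pts = np.column_stack([np.ones(len(X_points)), X_points])
    penalty = np.eye(p + 1)
    penalty[0, 0] = 0.0  # leave the intercept unpenalized

    preds = np.empty((n_boot, len(X_points)))
    for b in range(n_boot):
        # Fractional weights: large training weight pairs with small
        # validation weight for the same row, and vice versa.
        u = np.clip(rng.uniform(size=n), 1e-12, 1 - 1e-12)
        w_train = -np.log(u)
        w_valid = -np.log1p(-u)

        AtW = A.T * w_train
        best_err, best_beta = np.inf, None
        for lam in lambdas:
            # Weighted ridge fit on the training weights.
            beta = np.linalg.solve(AtW @ A + lam * penalty, AtW @ y)
            # Tune on the validation copy of the same rows.
            err = np.sum(w_valid * (y - A @ beta) ** 2)
            if err < best_err:
                best_err, best_beta = err, beta
        preds[b] = A_pts @ best_beta

    ensemble_mean = preds.mean(axis=0)
    ensemble_sd = preds.std(axis=0, ddof=1)
    return (ensemble_mean - y.mean()) / ensemble_sd

def permutation_reference(X, y, X_points, n_perm=100, n_boot=50, rng=None):
    """n_perm x n_point matrix of standardized predictions from refits to
    randomly permuted copies of the response (hypothetical helper)."""
    rng = np.random.default_rng(rng)
    return np.vstack([
        svem_standardized_predictions(X, rng.permutation(y), X_points,
                                      n_boot=n_boot, rng=rng)
        for _ in range(n_perm)
    ])
```

Here `n_boot`, `n_perm`, and the rows of `X_points` play the roles of nBoot, nPerm, and the nPoint evaluation points. When the response carries real signal, the standardized predictions from the original data should sit well outside the cloud of rows returned by `permutation_reference`.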
