Better prediction by use of co-data: Adaptive group-regularized ridge regression
Mark A. van de Wiel, Tonje G. Lien, Wina Verlaat, Wessel N. van Wieringen, Saskia M. Wilting

TL;DR
This paper introduces GRridge, an adaptive group-regularized ridge regression method that leverages co-data to improve prediction accuracy in high-dimensional genomic studies, with efficient empirical Bayes estimation and enhanced variable selection.
Contribution
The paper presents a novel adaptive group-regularized ridge regression approach that uses co-data for improved prediction and variable selection, requiring only one global penalty parameter.
Findings
GRridge outperforms logistic ridge regression and group lasso on cancer genomics data.
The method enhances variable selection by distinguishing near-zero and large coefficients.
Predictive performance remains strong with a reduced set of 42 variables.
Abstract
For many high-dimensional studies, additional information on the variables, like (genomic) annotation or external p-values, is available. In the context of binary and continuous prediction, we develop a method for adaptive group-regularized (logistic) ridge regression, which makes structural use of such 'co-data'. Here, 'groups' refer to a partition of the variables according to the co-data. We derive empirical Bayes estimates of group-specific penalties, which possess several nice properties: i) they are analytical; ii) they adapt to the informativeness of the co-data for the data at hand; iii) only one global penalty parameter requires tuning by cross-validation. In addition, the method allows use of multiple types of co-data at little extra computational effort. We show that the group-specific penalties may lead to a larger distinction between 'near-zero' and relatively large…
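To make the idea of group-specific penalties concrete, here is a minimal sketch of linear ridge regression in which each variable's penalty is a single global penalty times a multiplier for its co-data group. It is an illustration under stated assumptions, not the authors' implementation: the function name `group_ridge`, the simulated data, and the fixed multipliers are all hypothetical. In particular, the paper estimates the group penalties by empirical Bayes and also covers the logistic case, neither of which is reproduced here.

```python
import numpy as np


def group_ridge(X, y, groups, global_penalty, multipliers):
    """Closed-form linear ridge with one penalty multiplier per variable group.

    Minimizes ||y - X b||^2 + sum_j global_penalty * multipliers[groups[j]] * b_j^2.
    `groups[j]` is the (integer) co-data group of variable j.
    """
    multipliers = np.asarray(multipliers, dtype=float)
    # Per-variable penalty: the global penalty scaled by the group's multiplier.
    penalties = global_penalty * multipliers[np.asarray(groups)]
    return np.linalg.solve(X.T @ X + np.diag(penalties), X.T @ y)


# Simulated example (assumption): group 0 carries signal, group 1 is pure noise.
rng = np.random.default_rng(0)
n, p = 40, 100
X = rng.standard_normal((n, p))
groups = np.repeat([0, 1], p // 2)
beta_true = np.where(groups == 0, rng.standard_normal(p), 0.0)
y = X @ beta_true + 0.1 * rng.standard_normal(n)

# Ordinary ridge: every group gets the same multiplier.
beta_plain = group_ridge(X, y, groups, 1.0, [1.0, 1.0])
# Group-regularized ridge: lighter penalty on group 0, heavier on group 1.
# These multipliers are hand-picked for illustration, not estimated.
beta_group = group_ridge(X, y, groups, 1.0, [0.2, 5.0])
```

With all multipliers equal to 1 the function reduces to ordinary ridge regression, so only the global penalty would need tuning once the multipliers are fixed.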
Taxonomy
Topics: Gene expression and cancer classification · Bioinformatics and Genomic Networks · Statistical Methods and Inference
