ecpc: An R-package for generic co-data models for high-dimensional prediction
Mirrelijn M. van Nee, Lodewyk F.A. Wessels, Mark A. van de Wiel

TL;DR
The paper introduces an extension to the ecpc R-package for high-dimensional prediction, enabling more flexible modeling of continuous co-data to improve variable selection and prediction accuracy.
Contribution
It presents a new approach for modeling continuous co-data within the ecpc package, using a regression framework for empirical Bayes estimation, enhancing flexibility and efficiency.
Findings
Improved modeling of continuous co-data with the new extension.
Enhanced variable selection performance in simulations.
Demonstrated practical applications in real data examples.
Abstract
High-dimensional prediction considers data with more variables than samples. Generic research goals are to find the best predictor or to select variables. Results may be improved by exploiting prior information in the form of co-data, providing complementary data not on the samples, but on the variables. We consider adaptive ridge penalised generalised linear and Cox models, in which the variable specific ridge penalties are adapted to the co-data to give a priori more weight to more important variables. The R-package ecpc originally accommodated various and possibly multiple co-data sources, including categorical co-data, i.e. groups of variables, and continuous co-data. Continuous co-data, however, was handled by adaptive discretisation, which may model the co-data inefficiently and lose information. Here, we present an extension to the method and software for generic co-data models,…
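The idea of variable-specific ridge penalties can be illustrated with a minimal sketch. It is written in plain Python for a linear model only and does not use or reproduce the ecpc API. The function names (`solve`, `adaptive_ridge`, `penalties_from_codata`) and the exponential mapping from co-data to penalties are assumptions made for illustration; the package estimates the relation between co-data and penalties from the data, which this sketch does not attempt.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def adaptive_ridge(X, y, penalties):
    """Ridge estimate with one penalty per variable:
    beta = (X^T X + diag(penalties))^{-1} X^T y."""
    p = len(X[0])
    xtx = [[sum(row[j] * row[k] for row in X) for k in range(p)] for j in range(p)]
    for j in range(p):
        xtx[j][j] += penalties[j]
    xty = [sum(row[j] * yi for row, yi in zip(X, y)) for j in range(p)]
    return solve(xtx, xty)


def penalties_from_codata(z, scale=1.0):
    """Hypothetical mapping (an assumption, not the ecpc model):
    a larger co-data value z_k gives a smaller penalty for variable k."""
    return [scale * math.exp(-zk) for zk in z]


# Toy data with more variables (3) than samples (2).
X = [[1.0, 0.0, 1.0],
     [0.0, 1.0, 1.0]]
y = [1.0, 1.0]

# Uninformative co-data: all variables share the same penalty.
beta_uniform = adaptive_ridge(X, y, penalties_from_codata([0.0, 0.0, 0.0]))

# Informative co-data: the third variable is marked as more important a priori.
beta_codata = adaptive_ridge(X, y, penalties_from_codata([-2.0, -2.0, 2.0]))

print("uniform penalties:", beta_uniform)
print("co-data penalties:", beta_codata)
```

With the informative co-data, the third coefficient is shrunk less and the first two are shrunk more than under a common penalty, which is the sense in which co-data gives a priori more weight to some variables.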
Taxonomy
Topics: Data Analysis with R · Statistical Methods and Inference · Advanced Statistical Methods and Models
Methods: Linear Regression
