Penalised regression with multiple sources of prior effects
Armin Rauschenberger, Zied Landoulsi, Mark A. van de Wiel, Enrico Glaab

TL;DR
This paper introduces a penalised regression method that integrates multiple sources of prior information about feature effects, improving prediction accuracy in high-dimensional tasks, and is implemented in the R package 'transreg'.
Contribution
It presents a novel approach for combining multiple prior data sources into penalised regression models, enhancing predictive performance.
Findings
Improved prediction accuracy with prior information integration
Method validated through simulations and real data applications
Available as an R package 'transreg' for practical use
Abstract
In many high-dimensional prediction or classification tasks, complementary data on the features are available, e.g. prior biological knowledge on (epi)genetic markers. Here we consider tasks with numerical prior information that provides an insight into the importance (weight) and the direction (sign) of the feature effects, e.g. regression coefficients from previous studies. We propose an approach for integrating multiple sources of such prior information into penalised regression. If suitable co-data are available, this improves the predictive performance, as shown by simulation and application. The proposed method is implemented in the R package 'transreg' (https://github.com/lcsb-bds/transreg).
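To make the idea of combining several sources of prior effects concrete, here is a minimal Python sketch. It is not the algorithm of the paper and does not use the 'transreg' API. It assumes a simplified two-step scheme: first weight each prior coefficient vector by regressing the outcome on the prior-based linear scores, then fit a ridge regression that shrinks towards the weighted combination instead of towards zero. All function names, the simulated data, and the fixed penalty `lam` are hypothetical choices for illustration.

```python
import numpy as np


def ridge(X, y, lam, beta0=None):
    """Ridge estimate shrunk towards beta0 (towards zero if beta0 is None)."""
    n, p = X.shape
    if beta0 is None:
        beta0 = np.zeros(p)
    resid = y - X @ beta0
    # Dual form X'(XX' + lam*I)^{-1} r, equivalent to (X'X + lam*I)^{-1} X'r
    # and cheaper when there are more features than samples.
    delta = X.T @ np.linalg.solve(X @ X.T + lam * np.eye(n), resid)
    return beta0 + delta


def combine_priors(X, y, priors):
    """Weight K prior coefficient vectors (rows of `priors`) by least squares.

    Assumption of this sketch: each prior source is summarised by its linear
    score X @ prior, and the outcome is regressed on these K scores.
    """
    Z = X @ priors.T                       # (n, K) matrix of prior scores
    w, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return priors.T @ w                    # weighted prior effects, length p


def simulate(seed=0, n=50, p=200, n_test=1000, lam=1.0):
    """Compare plain ridge with prior-informed ridge on simulated data."""
    rng = np.random.default_rng(seed)
    beta = rng.normal(size=p)              # true effects (simulation only)
    X = rng.normal(size=(n, p))
    y = X @ beta + rng.normal(size=n)
    X_test = rng.normal(size=(n_test, p))
    y_test = X_test @ beta + rng.normal(size=n_test)

    # Two hypothetical prior sources: one informative, one pure noise.
    priors = np.vstack([
        beta + 0.5 * rng.normal(size=p),
        rng.normal(size=p),
    ])

    beta_plain = ridge(X, y, lam)
    beta_prior = ridge(X, y, lam, beta0=combine_priors(X, y, priors))

    def mse(b):
        return float(np.mean((y_test - X_test @ b) ** 2))

    return mse(beta_plain), mse(beta_prior)


if __name__ == "__main__":
    mse_plain, mse_prior = simulate()
    print(f"test MSE without priors: {mse_plain:.1f}")
    print(f"test MSE with priors:    {mse_prior:.1f}")
```

In this simulated setting the informative source carries both the sign and the rough magnitude of the true effects, so shrinking towards the weighted priors should give a lower test error than shrinking towards zero. The uninformative source should receive a weight near zero in the least-squares step, which is one simple way to let the data decide how much each source is trusted.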
Taxonomy
Topics: Statistical Methods and Inference · Genetic and phenotypic traits in livestock · Gene expression and cancer classification
Methods: Logistic Regression · Linear Regression
