Penalised regression with multiple sources of prior effects

Armin Rauschenberger; Zied Landoulsi; Mark A. van de Wiel; Enrico Glaab

arXiv:2212.08581 · stat.ME · December 19, 2022

TL;DR

This paper introduces a penalised regression method that integrates multiple sources of prior information about feature effects (their weights and signs). When suitable co-data are available, this improves predictive performance in high-dimensional tasks. The method is implemented in the R package 'transreg'.

Contribution

It presents a novel approach for combining multiple prior data sources into penalised regression models, enhancing predictive performance.

Findings

01 Improved prediction accuracy when suitable prior information is integrated

02 Validated through simulations and real-data applications

03 Available as the R package 'transreg'

Abstract

In many high-dimensional prediction or classification tasks, complementary data on the features are available, e.g. prior biological knowledge on (epi)genetic markers. Here we consider tasks with numerical prior information that provides insight into the importance (weight) and the direction (sign) of the feature effects, e.g. regression coefficients from previous studies. We propose an approach for integrating multiple sources of such prior information into penalised regression. If suitable co-data are available, this improves the predictive performance, as shown by simulation and application. The proposed method is implemented in the R package 'transreg' (https://github.com/lcsb-bds/transreg).
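
To make the idea of combining several sources of prior effects concrete, here is a minimal Python sketch. It is not the algorithm implemented in 'transreg', whose details are not given on this page. It assumes a simple stand-in technique: each prior coefficient vector is turned into a linear predictor, and a ridge regression on those predictors learns one weight per source. The function names `ridge_fit` and `fit_with_prior_sources`, the simulated data, and the penalty value `lam` are all hypothetical choices made for illustration.

```python
import numpy as np


def ridge_fit(X, y, lam):
    """Closed-form ridge estimate (X'X + lam*I)^{-1} X'y, without intercept."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)


def fit_with_prior_sources(X, y, priors, lam=1.0):
    """Hypothetical helper (an assumption, not the transreg method).

    priors is a p x K matrix with one column of prior effects per source.
    Returns one weight per source and the combined coefficient vector.
    """
    Z = X @ priors                  # n x K: one linear predictor per source
    weights = ridge_fit(Z, y, lam)  # how much each source is trusted
    beta = priors @ weights         # combined effects, length p
    return weights, beta


# Simulated high-dimensional data (n < p), for illustration only.
rng = np.random.default_rng(0)
n, p = 50, 200
beta_true = np.zeros(p)
beta_true[:10] = rng.normal(size=10)
X = rng.normal(size=(n, p))
y = X @ beta_true + rng.normal(scale=0.5, size=n)

# Two simulated prior sources: one informative, one pure noise.
informative = beta_true + rng.normal(scale=0.1, size=p)
uninformative = rng.normal(size=p)
priors = np.column_stack([informative, uninformative])

weights, beta = fit_with_prior_sources(X, y, priors)
print("source weights:", weights)
```

In this toy setting the informative source should receive a much larger weight than the noise source, which mirrors the abstract's point that the benefit depends on the co-data being suitable.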

Taxonomy

Topics: Statistical Methods and Inference · Genetic and phenotypic traits in livestock · Gene expression and cancer classification

Methods: Logistic Regression · Linear Regression