# Comparing variable and feature selection strategies for prediction - protocol of a simulation study in low-dimensional transplantation data

**Authors:** Linard Hoessly, Jaromil Frossard, Simon Schwab, Frédérique Chammartin, Alexander Leichtle, Peter Werner Schreiber, Dionysios Neofytos, Michael Koller, Syed Nisar Hussain Bukhari

PMC · DOI: 10.1371/journal.pone.0328696 · PLOS One · 2025-08-01

## TL;DR

This paper outlines a simulation study to compare variable selection methods in low-dimensional clinical data using both traditional and machine learning approaches.

## Contribution

The study introduces a structured protocol for comparing variable selection strategies in low-dimensional clinical prediction models.

## Key findings

- Variable selection strategies will be compared on relative predictive accuracy and its variability, using six statistical learning approaches for both data generation and model learning.
- The study will assess both predictive and descriptive accuracy of variable selection methods.
- A simulation-based framework is proposed for evaluating variable selection in clinical data.

## Abstract

Machine learning methods have become prevalent in the development of clinical prediction models, with results often suggesting superior performance compared to traditional statistical techniques. For low-dimensional datasets, we plan to compare variable selection methods from both the classical and the machine learning paradigms in a simulation study. The primary aim is to compare the variable selection strategies with respect to relative predictive accuracy and its variability; the secondary aim is to compare their descriptive accuracy. We use six distinct statistical learning approaches for both data generation and model learning. The present manuscript is the protocol for the corresponding simulation study (study registration: Open Science Framework ID k6c8f). We describe the planned steps following the Aims, Data, Estimands, Methods, and Performance (ADEMP) framework for simulation study design and reporting.
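To make the design concrete, the following is a minimal sketch of how such a simulation loop could be organised. It is not the protocol's actual setup: the linear data-generating mechanism, the coefficient vector, the two selection strategies (`select_full`, `select_filter`), the correlation threshold, and the use of test-set mean squared error are all illustrative assumptions, and the protocol's six learning approaches are not reproduced here.

```python
import random
import statistics


def simulate(n, beta, rng, noise_sd=1.0):
    """Draw n rows from a hypothetical linear mechanism y = x.beta + noise."""
    X = [[rng.gauss(0, 1) for _ in beta] for _ in range(n)]
    y = [sum(b * x for b, x in zip(beta, row)) + rng.gauss(0, noise_sd) for row in X]
    return X, y


def ols(X, y, cols):
    """Least squares with intercept on the selected columns (normal equations)."""
    Z = [[1.0] + [row[c] for c in cols] for row in X]
    p = len(Z[0])
    # Augmented matrix [Z'Z | Z'y], solved by Gauss-Jordan elimination.
    A = [[sum(z[i] * z[j] for z in Z) for j in range(p)]
         + [sum(z[i] * t for z, t in zip(Z, y))] for i in range(p)]
    for i in range(p):
        piv = max(range(i, p), key=lambda r: abs(A[r][i]))  # partial pivoting
        A[i], A[piv] = A[piv], A[i]
        d = A[i][i]
        A[i] = [v / d for v in A[i]]
        for r in range(p):
            if r != i:
                f = A[r][i]
                A[r] = [vr - f * vi for vr, vi in zip(A[r], A[i])]
    return [A[i][p] for i in range(p)]


def mse(coef, X, y, cols):
    """Mean squared prediction error of a fitted model on (X, y)."""
    preds = [coef[0] + sum(c * row[j] for c, j in zip(coef[1:], cols)) for row in X]
    return statistics.fmean((p - t) ** 2 for p, t in zip(preds, y))


def corr(a, b):
    """Pearson correlation, written out to avoid version-specific helpers."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    num = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    den = (sum((u - ma) ** 2 for u in a) * sum((v - mb) ** 2 for v in b)) ** 0.5
    return num / den


def select_full(X, y):
    """Strategy 1 (assumed): no selection, keep every candidate variable."""
    return list(range(len(X[0])))


def select_filter(X, y, threshold=0.2):
    """Strategy 2 (assumed): keep variables whose |correlation with y| exceeds a threshold."""
    return [j for j in range(len(X[0])) if abs(corr([r[j] for r in X], y)) > threshold]


def run_study(n_rep=50, n_train=100, n_test=500, seed=1):
    rng = random.Random(seed)
    beta = [1.0, 0.5, 0.0, 0.0, 0.0]  # hypothetical: two true predictors, three noise
    strategies = {"full": select_full, "filter": select_filter}
    errors = {name: [] for name in strategies}
    for _ in range(n_rep):
        X_tr, y_tr = simulate(n_train, beta, rng)
        X_te, y_te = simulate(n_test, beta, rng)
        for name, select in strategies.items():
            cols = select(X_tr, y_tr)          # selection uses training data only
            coef = ols(X_tr, y_tr, cols)
            errors[name].append(mse(coef, X_te, y_te, cols))
    # Predictive accuracy (mean) and its variability (SD) across repetitions.
    return {name: (statistics.fmean(e), statistics.stdev(e)) for name, e in errors.items()}


results = run_study()
for name, (mean_err, sd_err) in results.items():
    print(f"{name}: mean test MSE = {mean_err:.3f}, SD = {sd_err:.3f}")
```

Each strategy is applied to the same simulated training sets and evaluated on the same independent test sets, so differences in the mean and standard deviation of the test error reflect the selection strategy rather than sampling differences between datasets.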

## Full-text entities

- **Diseases:** infectious diseases (MESH:D003141)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12316309/full.md

## Figures

20 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12316309/full.md

## References

48 references — full list in the complete paper: https://tomesphere.com/paper/PMC12316309/full.md

---
Source: https://tomesphere.com/paper/PMC12316309