# Similarity-based Random Survival Forest

**Authors:** Yingying Xu, Joon Lee, Joel A. Dubin

arXiv: 1903.01029 · 2019-08-06

## TL;DR

This paper introduces a modified random survival forest that identifies similar cases to improve prediction of time-to-event outcomes in heterogeneous medical datasets. The method is demonstrated on the MIMIC-III ICU database and examined in a simulation study.

## Contribution

The paper proposes a similarity-based modification of the random survival forest algorithm, intended to improve prediction accuracy for survival analysis in heterogeneous datasets, together with an adaptation for the case of dependent censoring.

## Key findings

- Higher predictive accuracy than a standard random survival forest across the analyses the authors undertook.
- Demonstrated on the MIMIC-III intensive care database.
- Properties of the method examined in a comprehensive simulation study.

## Abstract

Predicting time-to-event outcomes in large databases can be a challenging but important task. One example of this is in predicting the time to a clinical outcome for patients in intensive care units (ICUs), which helps to support critical medical treatment decisions. In this context, the time to an event of interest could be, for example, survival time or time to recovery from a disease/ailment observed within the ICU. The massive health datasets generated from the uptake of Electronic Health Records (EHRs) are quite heterogeneous as patients can be quite dissimilar in their relationship between the feature vector and the outcome, adding more noise than information to prediction. In this paper, we propose a modified random forest method for survival data that identifies similar cases in an attempt to improve accuracy for predicting time-to-event outcomes; this methodology can be applied in various settings, including with ICU databases. We also introduce an adaptation of our methodology in the case of dependent censoring. Our proposed method is demonstrated in the Medical Information Mart for Intensive Care (MIMIC-III) database, and, in addition, we present properties of our methodology through a comprehensive simulation study. Introducing similarity to the random survival forest method indeed provides improved predictive accuracy compared to random survival forest alone across the various analyses we undertook.
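The idea in the abstract, restricting the training data to cases similar to the patient being predicted before fitting a survival model, can be sketched in a few lines. This is a minimal illustration, not the authors' algorithm: the similarity measure (Euclidean distance on standardized features), the neighbourhood size `k`, and the synthetic cohort are all assumptions, and a Kaplan-Meier estimator stands in for the random survival forest so that the example needs only the Python standard library.

```python
import math
import random


def standardize(rows):
    """Scale each feature column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 for constant columns to avoid division by zero.
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
           for c, m in zip(cols, means)]
    scaled = [[(v - m) / s for v, m, s in zip(r, means, sds)] for r in rows]
    return scaled, means, sds


def most_similar(train_x, query, k):
    """Indices of the k training rows closest to the query.

    Euclidean distance is an assumed similarity measure for illustration.
    """
    dists = [(math.dist(row, query), i) for i, row in enumerate(train_x)]
    return [i for _, i in sorted(dists)[:k]]


def kaplan_meier(times, events):
    """Kaplan-Meier curve as a list of (event time, survival probability)."""
    curve, surv = [], 1.0
    for t in sorted(set(times)):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        if deaths:
            surv *= 1.0 - deaths / at_risk
            curve.append((t, surv))
    return curve


# Synthetic, heterogeneous cohort: two subgroups with different hazards.
random.seed(0)
features, times, events = [], [], []
for _ in range(200):
    high_risk = random.random() < 0.5
    x = [random.gauss(3.0 if high_risk else 0.0, 1.0), random.gauss(0.0, 1.0)]
    event_time = random.expovariate(1.0 if high_risk else 0.2)
    censor_time = random.expovariate(0.1)  # independent censoring
    features.append(x)
    times.append(min(event_time, censor_time))
    events.append(event_time <= censor_time)

scaled, means, sds = standardize(features)

# Hypothetical new patient, scaled with the training means and deviations.
query = [(v - m) / s for v, m, s in zip([3.0, 0.0], means, sds)]

# Estimate survival from the 50 most similar cases rather than the full cohort.
idx = most_similar(scaled, query, k=50)
local_curve = kaplan_meier([times[i] for i in idx], [events[i] for i in idx])
global_curve = kaplan_meier(times, events)
```

Comparing `local_curve` with `global_curve` shows how an estimate built from similar cases can differ from one built on the whole cohort. In a fuller implementation the Kaplan-Meier step would be replaced by a random survival forest fitted on the selected cases, and the paper's own definition of similarity may differ from the one assumed here.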

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1903.01029/full.md

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/1903.01029/full.md

## References

22 references — full list in the complete paper: https://tomesphere.com/paper/1903.01029/full.md

---
Source: https://tomesphere.com/paper/1903.01029