Tree-based methods for length-biased survival data

Jinwoo Lee, Donghwan Lee, Hyunwoo Lee, and Jiyu Sun

arXiv:2508.16312·stat.ME·December 23, 2025

TL;DR

This paper introduces new tree-based methods tailored for length-biased survival data, improving efficiency and accuracy in survival prediction by accounting for the unique sampling bias in prevalent cohort studies.

Contribution

The paper develops survival trees and forests specifically designed for length-biased data, utilizing a score function for better variable selection and unbiased survival estimation.

Findings

01. Simulation studies show improved tree recovery and prediction accuracy.

02. Methods outperform traditional approaches in efficiency and bias correction.

03. Application to lung cancer data demonstrates practical utility.

Abstract

Left-truncated survival data commonly arise in prevalent cohort studies, which include only individuals who have experienced disease onset and survived until enrollment in the study. When the onset process follows a stationary Poisson process, the resulting data are length-biased. This sampling mechanism induces a selection bias toward individuals with longer survival times, so statistical methods for traditional survival data are not directly applicable. While tree-based methods developed for left-truncated data can be applied, they may be inefficient for length-biased data, as they do not account for the distribution of truncation times. To address this, we propose new survival trees and forests for length-biased right-censored data within the conditional inference framework. Our approach uses a score function derived from the full likelihood to construct permutation test statistics for variable…
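
The selection bias described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's method: it only mimics prevalent cohort sampling under assumptions chosen for illustration (exponential survival times with mean 1, onset times spread uniformly over a long window before enrollment, no right censoring). The function name `simulate_prevalent_cohort` and all parameter values are hypothetical. Under these assumptions, sampled survival times follow a density proportional to t·f(t), so their mean should be close to E[T²]/E[T] = 2 rather than the population mean of 1.

```python
import random


def simulate_prevalent_cohort(n_onsets, window, mean_survival, seed=0):
    """Return (truncation_time, survival_time) pairs for subjects alive at enrollment.

    Enrollment happens at calendar time 0. Onsets are uniform on [-window, 0],
    which is what a stationary Poisson onset process looks like once the
    number of onsets is fixed. All distributional choices are illustrative.
    """
    rng = random.Random(seed)
    sampled = []
    for _ in range(n_onsets):
        onset = -rng.uniform(0.0, window)            # calendar time of disease onset
        t = rng.expovariate(1.0 / mean_survival)     # survival time measured from onset
        if onset + t > 0:                            # subject is still alive at enrollment
            sampled.append((-onset, t))              # truncation time = onset-to-enrollment gap
    return sampled


cohort = simulate_prevalent_cohort(200_000, window=50.0, mean_survival=1.0)

# Mean survival time among sampled subjects (length-biased sample).
biased_mean = sum(t for _, t in cohort) / len(cohort)

# Mean truncation time; under length-biased sampling the truncation time is
# uniform on (0, T) given T, so this should be about half of biased_mean.
mean_truncation = sum(a for a, _ in cohort) / len(cohort)

print(f"sampled subjects:        {len(cohort)}")
print(f"mean survival (sampled): {biased_mean:.2f}")
print(f"mean truncation time:    {mean_truncation:.2f}")
```

The known uniform structure of the truncation times is the extra information that generic left-truncation methods ignore, which is the efficiency gap the abstract refers to.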
