Positive-Unlabelled Survival Data Analysis
Tomoki Toyabe, Yasuhiro Hasegawa, and Takahiro Hoshino

TL;DR
This paper introduces a framework for analyzing positive-unlabeled survival data, in which event status is unknown for some subjects, and proposes models and estimation strategies that avoid the bias of traditional survival analysis in this setting.
Contribution
It develops parametric, nonparametric, and machine learning models for positive-unlabeled survival data, including estimation methods for cases with and without observed censoring times.
Findings
Traditional survival analysis can be biased under this data setup.
Proposed methods yield valid survival estimates in simulations.
In simulations, the proposed models outperform traditional survival analysis on positive-unlabeled data.
Abstract
In this paper, we consider a novel framework for positive-unlabeled data. Positive data consist of survival times observed for subjects who have events during the observation period. Unlabeled data consist of censoring times observed for subjects for whom it is unknown whether the event has occurred. We consider two cases: (1) the censoring time is observed in the positive data, and (2) it is not observed. For both cases, we develop parametric, nonparametric, and machine learning models, together with estimation strategies for these models. Simulation studies show that under this data setup, traditional survival analysis may yield severely biased results, while the proposed estimation methods can provide valid results.
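The bias described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' model or estimator; it is one possible reading of the data setup, under assumptions that do not come from the paper: exponential event and censoring times, and a hypothetical labelling probability `label_prob` with which an event that occurred is actually recorded. Subjects whose events go unrecorded appear as unlabeled with only their censoring time, and the naive estimator treats every unlabeled record as censored.

```python
import random


def simulate_pu_survival(n, hazard, censor_rate, label_prob, seed=0):
    """Simulate a hypothetical positive-unlabeled survival sample.

    Assumptions (illustrative only, not from the paper): exponential event
    and censoring times, and each occurred event is recorded with
    probability label_prob. Returns a list of (time, is_positive) pairs.
    """
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        event_time = rng.expovariate(hazard)
        censor_time = rng.expovariate(censor_rate)
        had_event = event_time <= censor_time
        if had_event and rng.random() < label_prob:
            # Positive: the survival time is observed.
            records.append((event_time, True))
        else:
            # Unlabeled: only the censoring time is observed, and whether
            # the event occurred before it is unknown to the analyst.
            records.append((censor_time, False))
    return records


def naive_exponential_hazard(records):
    """Exponential MLE that treats every unlabeled record as censored."""
    events = sum(1 for _, is_positive in records if is_positive)
    exposure = sum(time for time, _ in records)
    return events / exposure


TRUE_HAZARD = 1.0

# Some events go unrecorded: the naive estimate falls well below the truth.
pu_records = simulate_pu_survival(
    n=20000, hazard=TRUE_HAZARD, censor_rate=0.5, label_prob=0.6, seed=0
)
naive_hazard = naive_exponential_hazard(pu_records)

# Every event is recorded: ordinary right-censored data, the estimate is close.
full_records = simulate_pu_survival(
    n=20000, hazard=TRUE_HAZARD, censor_rate=0.5, label_prob=1.0, seed=0
)
full_hazard = naive_exponential_hazard(full_records)

print(f"true hazard:                   {TRUE_HAZARD:.3f}")
print(f"naive estimate, PU data:       {naive_hazard:.3f}")
print(f"naive estimate, fully labeled: {full_hazard:.3f}")
```

In this toy setting the naive estimator underestimates the hazard, and so overestimates survival, because unrecorded events are counted as censored and contribute exposure time without contributing events. The parametric, nonparametric, and machine learning estimators developed in the paper are not reproduced here.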
Taxonomy
Topics: Statistical Methods and Inference · Machine Learning and Data Classification · Machine Learning and Algorithms
