Semi-supervised Approach to Event Time Annotation Using Longitudinal Electronic Health Records

Liang Liang; Jue Hou; Hajime Uno; Kelly Cho; Yanyuan Ma; Tianxi Cai

arXiv:2110.09612·stat.ME·October 20, 2021

Open Access

TL;DR

This paper introduces a semi-supervised multi-modal method for annotating event times in electronic health records, combining functional data analysis and penalized modeling to improve accuracy in clinical outcome prediction.

Contribution

It presents a novel two-step semi-supervised approach that leverages longitudinal EHR data for accurate event time annotation, addressing limitations of manual annotation and simple code-based estimates.

Findings

01. Outperforms existing methods in simulations
02. Accurately annotates lung cancer recurrence times
03. Demonstrates root-n consistency of estimators

Abstract

Large clinical datasets derived from insurance claims and electronic health record (EHR) systems are valuable sources for precision medicine research. These datasets can be used to develop models for personalized prediction of risk or treatment response. Efficiently deriving prediction models using real-world data, however, faces practical and methodological challenges. Precise information on important clinical outcomes, such as time to cancer progression, is not readily available in these databases. The true clinical event times typically cannot be approximated well from simple extracts of billing or procedure codes, while annotating event times manually is prohibitively time- and resource-intensive. In this paper, we propose a two-step semi-supervised multi-modal automated time annotation (MATA) method leveraging multi-dimensional longitudinal EHR encounter records. In step I, we employ a…

Taxonomy

Topics: Machine Learning in Healthcare · Statistical Methods and Inference