Bayesian Cox Regression for Large-scale Inference with Applications to Electronic Health Records
Alexander W. Jung, Moritz Gerstung

TL;DR
This paper introduces a Bayesian Cox regression method tailored for large-scale, high-dimensional biomedical datasets, enabling efficient inference and uncertainty estimation in time-to-event analysis.
Contribution
It develops a scalable Bayesian approach using stochastic variational inference for Cox models, suitable for datasets with millions of records and thousands of covariates.
Findings
Scales to large datasets with millions of data points.
Provides reliable uncertainty estimates in high-dimensional settings.
Demonstrated utility on UK Biobank myocardial infarction data.
Abstract
The Cox model is an indispensable tool for time-to-event analysis, particularly in biomedical research. However, medicine is undergoing a profound transformation, generating data at an unprecedented scale, which opens new frontiers to study and understand diseases. With the wealth of data collected, new challenges for statistical inference arise, as datasets are often high dimensional, exhibit an increasing number of measurements at irregularly spaced time points, and are simply too large to fit in memory. Many current implementations for time-to-event analysis are ill-suited for these problems as inference is computationally demanding and requires access to the full data at once. Here we propose a Bayesian version for the counting process representation of Cox's partial likelihood for efficient inference on large-scale datasets with millions of data points and thousands of…
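To make the "counting process representation of Cox's partial likelihood" concrete, here is a minimal sketch of the classical (non-Bayesian) negative log partial likelihood on (start, stop] interval records. It is not the authors' implementation: the function name, the record layout, the toy data, and the Breslow-style treatment of the risk set are assumptions for illustration. The paper's contribution, the Bayesian treatment with stochastic variational inference, is not shown here.

```python
import math


def neg_log_partial_likelihood(beta, records):
    """Negative log partial likelihood for counting-process data.

    records: list of (start, stop, event, covariates) tuples, where each
    record is at risk on the interval (start, stop] and event is 1 if an
    event occurs at `stop`, else 0. (Hypothetical layout for illustration.)
    """
    def linpred(x):
        # Linear predictor x . beta
        return sum(b * v for b, v in zip(beta, x))

    total = 0.0
    for start_i, stop_i, event_i, x_i in records:
        if not event_i:
            continue
        # Risk set at the event time: every interval (start, stop] covering it.
        risk = [linpred(x) for s, e, _, x in records if s < stop_i <= e]
        # Log-sum-exp with the maximum subtracted for numerical stability.
        m = max(risk)
        log_denom = m + math.log(sum(math.exp(r - m) for r in risk))
        total += linpred(x_i) - log_denom
    return -total


# Toy data (made up): three interval records with one covariate each.
records = [
    (0.0, 2.0, 1, [1.0]),
    (0.0, 3.0, 0, [0.0]),
    (1.0, 4.0, 1, [0.5]),
]
value = neg_log_partial_likelihood([0.0], records)
```

Because each event term only needs the records whose interval covers that event time, a sum of this form can in principle be evaluated on subsets of the data, which is the kind of structure that mini-batch methods such as stochastic variational inference rely on.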
Taxonomy
Topics: Statistical Methods and Inference · Statistical Methods and Bayesian Inference · Advanced Statistical Process Monitoring
