# Raking and Regression Calibration: Methods to Address Bias from Correlated Covariate and Time-to-Event Error

**Authors:** Eric J. Oh, Bryan E. Shepherd, Thomas Lumley, and Pamela A. Shaw

arXiv: 1905.08330 · 2020-03-10

## TL;DR

This paper proposes regression calibration (RC) and raking estimators that jointly correct for correlated measurement error in covariates and a censored time-to-event outcome, with the aim of reducing bias in studies based on electronic health records (EHR).

## Contribution

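It proposes RC estimators for correlated error in the covariates and the censored event time, together with raking estimators that are consistent for the parameter defined by the population estimating equation and can improve upon RC in certain settings with failure-time data.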

## Key findings

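- RC is biased for nonlinear regression models such as the Cox model; raking can improve upon RC in certain settings with failure-time data.
- Raking requires no explicit modeling of the error structure and can be used under outcome-dependent sampling designs.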
- Application to HIV EHR data demonstrates practical utility.

## Abstract

Medical studies that depend on electronic health records (EHR) data are often subject to measurement error, as the data are not collected to support research questions under study. These data errors, if not accounted for in study analyses, can obscure or cause spurious associations between patient exposures and disease risk. Methodology to address covariate measurement error has been well developed; however, time-to-event error has also been shown to cause significant bias but methods to address it are relatively underdeveloped. More generally, it is possible to observe errors in both the covariate and the time-to-event outcome that are correlated. We propose regression calibration (RC) estimators to simultaneously address correlated error in the covariates and the censored event time. Although RC can perform well in many settings with covariate measurement error, it is biased for nonlinear regression models, such as the Cox model. Thus, we additionally propose raking estimators which are consistent estimators of the parameter defined by the population estimating equation. Raking can improve upon RC in certain settings with failure-time data, require no explicit modeling of the error structure, and can be utilized under outcome-dependent sampling designs. We discuss features of the underlying estimation problem that affect the degree of improvement the raking estimator has over the RC approach. Detailed simulation studies are presented to examine the performance of the proposed estimators under varying levels of signal, error, and censoring. The methodology is illustrated on observational EHR data on HIV outcomes from the Vanderbilt Comprehensive Care Clinic.
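To make the raking idea in the abstract concrete, the following is a minimal sketch of generalized raking (weight calibration) in Python. It is not the authors' implementation: the function name `rake_weights`, the exponential adjustment, and the simulated data are all assumptions made for illustration, and this summary does not specify which auxiliary variables the paper uses. The sketch assumes an error-prone variable is known for the full cohort, a validated subsample is drawn with known design weights, and the weights are adjusted so that the weighted subsample totals of the auxiliary variables match the full-cohort totals.

```python
import numpy as np

def rake_weights(design_weights, aux, totals, max_iter=50, tol=1e-10):
    """Generalized raking with a multiplicative (exponential) adjustment.

    Finds lam such that the calibrated weights w_i = d_i * exp(aux_i @ lam)
    satisfy sum_i w_i * aux_i == totals, using Newton's method.
    """
    d = np.asarray(design_weights, dtype=float)
    X = np.asarray(aux, dtype=float)
    T = np.asarray(totals, dtype=float)
    lam = np.zeros(X.shape[1])
    for _ in range(max_iter):
        w = d * np.exp(X @ lam)
        resid = X.T @ w - T                 # calibration equation residual
        if np.max(np.abs(resid)) < tol:
            break
        jac = X.T @ (w[:, None] * X)        # derivative of residual w.r.t. lam
        lam -= np.linalg.solve(jac, resid)
    return d * np.exp(X @ lam)

# Hypothetical two-phase setup: error-prone covariate known for all N subjects,
# validated data available only for a simple random subsample of size n.
rng = np.random.default_rng(0)
N, n = 1000, 200
x_star = rng.normal(size=N)                       # simulated error-prone covariate
aux_full = np.column_stack([np.ones(N), x_star])  # intercept + auxiliary variable
totals = aux_full.sum(axis=0)                     # known full-cohort totals

sampled = rng.choice(N, size=n, replace=False)
d = np.full(n, N / n)                             # inverse sampling probabilities
w = rake_weights(d, aux_full[sampled], totals)

# The calibrated weights reproduce the full-cohort totals.
print(np.allclose(aux_full[sampled].T @ w, totals))
```

In a raking analysis, the calibrated weights `w` would then be used in place of the design weights when fitting the outcome model (for example, a weighted Cox model) on the validated subsample. The gain over the unadjusted weights depends on how strongly the auxiliary variables are related to the quantity being estimated.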

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1905.08330/full.md

## Figures

1 figure with captions in the complete paper: https://tomesphere.com/paper/1905.08330/full.md

## References

28 references — full list in the complete paper: https://tomesphere.com/paper/1905.08330/full.md

---
Source: https://tomesphere.com/paper/1905.08330