Correcting for bias due to categorisation based on cluster analysis using multiple continuous error-prone exposures
Timm Intemann, Iris Pigeot

TL;DR
This paper introduces three novel algorithms to correct bias in cluster-based exposure pattern analysis caused by measurement error in continuous variables, demonstrating that MI-based correction performs best in nutritional epidemiology simulations.
Contribution
The paper develops and compares three new correction algorithms (RC, SIMEX, MI) for bias due to measurement error in cluster analysis of continuous exposures, with MI being recommended.
Findings
MI-based correction outperforms other methods in simulations
Correction methods reduce bias in effect estimates
MI approach effectively incorporates outcome information
Abstract
The association between multidimensional exposure patterns and outcomes is commonly investigated by first applying cluster analysis algorithms to derive patterns and then estimating the associations. However, errors in the underlying continuous, possibly skewed, exposure variables lead to misclassified exposure patterns and therefore to biased effect estimates. This is often the case for lifestyle exposures in epidemiology, e.g. for dietary variables measured on a daily basis. We introduce three new algorithms for correcting the biased effect estimates, which are based on regression calibration (RC), simulation extrapolation (SIMEX) and multiple imputation (MI). In addition, the naive method ignoring the measurement error structure is considered for comparison. These methods are combined with the k-means cluster algorithm and the Gaussian mixture model to derive exposure patterns. The…
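The problem described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' code and implements none of the three correction algorithms; it only mimics the naive method, under assumptions that do not come from the paper: two Gaussian exposure variables, classical additive measurement error, two patterns, a hand-written k-means, and a linear outcome with a hypothetical effect `beta = 1.0`. All names and parameter values are illustrative.

```python
import numpy as np


def kmeans_two(data, n_iter=100, seed=0):
    """Plain Lloyd's k-means with two clusters (illustrative helper).

    Returns labels in {0, 1}, relabelled so that cluster 1 is the one
    whose centroid has the larger first coordinate.
    """
    rng = np.random.default_rng(seed)
    centroids = data[rng.choice(len(data), size=2, replace=False)]
    for _ in range(n_iter):
        # Distance of every observation to each of the two centroids.
        dists = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centroids = np.array([
            data[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(2)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    if centroids[0, 0] > centroids[1, 0]:
        labels = 1 - labels
    return labels


def mean_difference(y, pattern):
    """Effect estimate: mean outcome in pattern 1 minus pattern 0."""
    return y[pattern == 1].mean() - y[pattern == 0].mean()


# --- Assumed simulation setup (not taken from the paper) ---
rng = np.random.default_rng(42)
n = 2000
group = rng.integers(0, 2, size=n)
centres = np.array([[0.0, 0.0], [3.0, 3.0]])
x_true = centres[group] + rng.normal(0.0, 1.0, size=(n, 2))  # error-free exposures
w_obs = x_true + rng.normal(0.0, 2.0, size=(n, 2))           # classical additive error

pattern_true = kmeans_two(x_true)   # patterns from error-free exposures
pattern_naive = kmeans_two(w_obs)   # naive patterns from error-prone exposures

beta = 1.0  # hypothetical true effect of the exposure pattern
y = beta * pattern_true + rng.normal(0.0, 1.0, size=n)

effect_true = mean_difference(y, pattern_true)
effect_naive = mean_difference(y, pattern_naive)
misclassification = np.mean(pattern_true != pattern_naive)

print(f"misclassified patterns: {misclassification:.1%}")
print(f"effect using error-free patterns: {effect_true:.2f}")
print(f"naive effect using error-prone patterns: {effect_naive:.2f}")
```

In this setup the naive estimate is pulled towards zero because a share of observations lands in the wrong pattern. That is the bias the RC-, SIMEX- and MI-based algorithms are designed to correct.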
Taxonomy
Topics: Nutritional Studies and Diet · Statistical Methods in Epidemiology · Advanced Clustering Algorithms Research
