On the use of cross-fitting in causal machine learning with correlated units

Salvador V. Balkus, Hasan Laith, and Nima S. Hejazi

arXiv:2601.10899·stat.ME·May 12, 2026

TL;DR

This paper shows that standard cross-fitting, performed as if study units were independent, still eliminates key bias terms when units are correlated, so bespoke correlation-aware fold designs are typically unnecessary.

Contribution

It proves that cross-fitting that ignores correlation between units still removes key bias terms, which simplifies causal inference with spatial, clustered, or time-series data.

Findings

01

Cross-fitting as if units were independent still eliminates key bias terms when units are correlated.

02

Cross-fitting that ignores correlation achieves the same or improved bias and precision compared with fold designs that try to minimize correlation between folds.

03

Simulation experiments with various correlation structures support the theoretical result.

Abstract

In causal machine learning, the fitting and evaluation of nuisance models are often performed on separate partitions, or folds, of the observed data. This technique, called cross-fitting, eliminates bias introduced by the use of black-box predictive algorithms. When study units may be correlated, such as in spatial, clustered, or time-series data, investigators often design bespoke forms of cross-fitting to minimize correlation between folds. We prove that, perhaps contrary to popular belief, this is typically unnecessary: performing cross-fitting as if study units were independent still eliminates key bias terms even when units may be correlated. In simulation experiments with various correlation structures, we show that causal machine learning estimators achieve the same or improved bias and precision under cross-fitting that ignores correlation compared to techniques striving to…
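To make the idea of cross-fitting "as if study units were independent" concrete, here is a minimal sketch in Python. It is not the authors' code or estimator. It assumes an augmented inverse probability weighted (AIPW) estimator of the average treatment effect, uses simple linear models as stand-ins for black-box learners, and simulates clustered data with a shared random effect. The helper names `fit_linear` and `cross_fit_aipw`, the clipping bounds, and the simulation settings are all hypothetical choices made for illustration.

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares fit with an intercept; returns a prediction function.
    (A stand-in for any black-box nuisance learner.)"""
    Xd = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
    return lambda Xnew: np.column_stack([np.ones(len(Xnew)), Xnew]) @ beta

def cross_fit_aipw(X, A, Y, n_folds=5, seed=0):
    """Cross-fitted AIPW estimate of the average treatment effect.
    Folds are assigned uniformly at random, ignoring any correlation."""
    rng = np.random.default_rng(seed)
    n = len(Y)
    folds = rng.permutation(n) % n_folds  # random, roughly equal-sized folds
    phi = np.empty(n)
    for k in range(n_folds):
        train, test = folds != k, folds == k
        # Nuisance models are fit on the training folds only.
        mu1 = fit_linear(X[train & (A == 1)], Y[train & (A == 1)])
        mu0 = fit_linear(X[train & (A == 0)], Y[train & (A == 0)])
        pi = fit_linear(X[train], A[train])  # linear probability model
        # Nuisance models are evaluated on the held-out fold only.
        p = np.clip(pi(X[test]), 0.05, 0.95)  # assumed clipping bounds
        m1, m0 = mu1(X[test]), mu0(X[test])
        a, y = A[test], Y[test]
        phi[test] = m1 - m0 + a * (y - m1) / p - (1 - a) * (y - m0) / (1 - p)
    return phi.mean()

# Simulated clustered data (hypothetical settings); the true effect is 2.0.
rng = np.random.default_rng(1)
n_clusters, cluster_size = 200, 10
n = n_clusters * cluster_size
cluster = np.repeat(np.arange(n_clusters), cluster_size)
X = rng.normal(size=(n, 2))
u = rng.normal(size=n_clusters)[cluster]  # shared effect correlates units
A = rng.binomial(1, 1 / (1 + np.exp(-0.5 * X[:, 0])))
Y = 2.0 * A + X[:, 0] - X[:, 1] + u + rng.normal(size=n)

est = cross_fit_aipw(X, A, Y)
print(f"cross-fitted AIPW estimate: {est:.3f}")
```

The fold assignment never looks at `cluster`, so units from the same cluster routinely land in both the training and evaluation folds. The sketch returns only a point estimate; a standard error computed as if units were independent would not account for the within-cluster correlation.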
