TL;DR
crossfit is an R package designed for flexible, reproducible cross-fitting in semiparametric estimation, supporting complex DAG-based nuisance models and various fold-sharing strategies.
Contribution
It introduces a general-purpose, estimator-agnostic cross-fitting engine with explicit control over fold structure, dependence, and computational efficiency, suitable for benchmarking and method development.
Findings
Implements fold-sharing modes, including disjoint and independence-enforcing allocations.
Provides explicit, auditable schedules for cross-fitting procedures.
Supports simulation-heavy benchmarking and method development workflows.
Abstract
Cross-fitting is a key ingredient in many semiparametric estimation procedures, such as double/debiased machine learning (DML), enabling valid estimation of low-dimensional targets in the presence of high-dimensional nuisance functions by enforcing out-of-sample use of nuisance predictions. crossfit is an R package that provides a general-purpose, estimator-agnostic cross-fitting engine. Users specify (i) a target functional and (ii) a directed acyclic graph (DAG) of nuisance models, with node-specific training fold widths and target-specific evaluation windows. The engine executes a reproducible schedule over folds, panels, and repetitions, returning either a scalar estimate (mode="estimate") or a cross-fitted predictor function for application to new data (mode="predict"). Beyond standard cross-fitting, crossfit implements fold-allocation modes that control how training data are…
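To make the idea of "out-of-sample use of nuisance predictions" concrete, here is a minimal sketch of plain K-fold cross-fitting with a recorded schedule. It is written in Python for illustration only and does not reproduce the crossfit API: the names `make_folds` and `cross_fit`, the `fit`/`predict` callables, and the toy nuisance model (a training-fold mean) are all assumptions introduced for this example.

```python
import random
from statistics import fmean


def make_folds(n, k, seed=0):
    """Randomly partition indices 0..n-1 into k near-equal folds (hypothetical helper)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [sorted(idx[j::k]) for j in range(k)]


def cross_fit(y, folds, fit, predict):
    """Return out-of-fold nuisance predictions plus the schedule that produced them.

    Each observation is predicted by a model trained only on the other folds,
    so no prediction uses the observation it is evaluated on.
    """
    preds = [None] * len(y)
    schedule = []
    for j, eval_idx in enumerate(folds):
        held_out = set(eval_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        model = fit([y[i] for i in train_idx])  # train on the complement of fold j
        for i in eval_idx:
            preds[i] = predict(model, i)  # evaluate on fold j only
        # Record which indices were used where, so the run can be audited later.
        schedule.append({"fold": j, "train": train_idx, "eval": eval_idx})
    return preds, schedule


# Toy nuisance model (an assumption): predict the training-fold mean of y.
y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
folds = make_folds(len(y), k=3, seed=1)
preds, schedule = cross_fit(y, folds, fit=fmean, predict=lambda model, i: model)
```

The returned `schedule` lists the training and evaluation indices for every fold, which is one simple way to keep a cross-fitting run reproducible and checkable. DAG-structured nuisance models, node-specific training widths, and the package's fold-allocation modes are beyond this sketch.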
