xtdml: Double Machine Learning Estimation to Static Panel Data Models with Fixed Effects in R
Annalivia Polselli

TL;DR
The paper introduces the R package `xtdml`, which applies double machine learning to estimate the structural parameter in static panel data models with fixed effects, supporting inference when the confounders are high-dimensional.
Contribution
It provides a comprehensive implementation of DML methods for panel data in R, integrating machine learning for nuisance function estimation and handling fixed effects effectively.
Findings
Application to both simulated and real longitudinal data
Effective handling of fixed effects and high-dimensional confounders
Improved inference in panel data models
Abstract
The double machine learning (DML) method combines the predictive power of machine learning with statistical estimation to conduct inference about the structural parameter of interest. This paper presents the R package `xtdml`, which implements DML methods for partially linear panel regression models with low-dimensional fixed effects and high-dimensional confounding variables, as proposed by Clarke and Polselli (2025). The package provides functionality to: (a) learn nuisance functions with machine learning algorithms from the `mlr3` ecosystem; (b) handle unobserved individual heterogeneity by choosing among the first-difference transformation, the within-group transformation, and correlated random effects; and (c) transform the covariates with min-max normalization and polynomial expansion to improve learning performance. We showcase the use of `xtdml` with both simulated and real longitudinal data.
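The transformations named in (b) and (c) are standard panel-data operations. The following is a minimal, language-neutral sketch of three of them, written in plain Python rather than R. It does not use the `xtdml` API, whose function names are not given here; the helper names `within_transform`, `first_difference`, and `min_max` and the toy panel are hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Hypothetical helpers for illustration only; not part of xtdml.

def within_transform(ids, values):
    """Within-group transformation: subtract each unit's time mean."""
    sums, counts = defaultdict(float), defaultdict(int)
    for i, v in zip(ids, values):
        sums[i] += v
        counts[i] += 1
    return [v - sums[i] / counts[i] for i, v in zip(ids, values)]

def first_difference(ids, values):
    """First-difference transformation within each unit.

    Assumes rows are sorted by unit and then by time.
    The first observation of every unit is dropped.
    """
    out = []
    for k in range(1, len(values)):
        if ids[k] == ids[k - 1]:
            out.append(values[k] - values[k - 1])
    return out

def min_max(values):
    """Min-max normalization: rescale values to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant column: map everything to 0 to avoid division by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Toy panel (assumed data): y_it = alpha_i + 2 * t, with alpha_1 = 10, alpha_2 = 100.
ids = [1, 1, 1, 2, 2, 2]
y = [10.0, 12.0, 14.0, 100.0, 102.0, 104.0]

y_within = within_transform(ids, y)   # unit-specific intercepts removed
y_fd = first_difference(ids, y)       # unit-specific intercepts removed
y_scaled = min_max(y)                 # rescaled to [0, 1]
```

In this toy panel both the within-group and the first-difference transformations remove the unit-specific intercept, which is the reason they are used to handle unobserved individual heterogeneity. Correlated random effects and polynomial expansion are not shown.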
Taxonomy
Topics: Spatial and Panel Data Analysis · Statistical Methods and Inference · Advanced Causal Inference Techniques
