A Panel Quantile Approach to Attrition Bias in Big Data: Evidence from a Randomized Experiment
Matthew Harding, Carlos Lamarche

TL;DR
This paper develops a panel quantile regression method to address attrition bias in Big Data, providing a two-step estimator that is computationally feasible and effective in large-scale applications.
Contribution
It introduces a novel two-step quantile regression estimator for panel data with attrition, extending existing methods to Big Data contexts with practical implementation.
Findings
The estimator is computationally efficient for large datasets.
Monte Carlo simulations show good finite-sample properties.
The method is applied to an electricity pricing experiment, demonstrating its practical usefulness.
Abstract
This paper introduces a quantile regression estimator for panel data models with individual heterogeneity and attrition. The method is motivated by the fact that attrition bias is often encountered in Big Data applications. For example, many users sign up for the latest program but few remain active users several months later, making the evaluation of such interventions inherently very challenging. Building on earlier work by Hausman and Wise (1979), we provide a simple identification strategy that leads to a two-step estimation procedure. In the first step, the coefficients of interest in the selection equation are consistently estimated using parametric or nonparametric methods. In the second step, standard panel quantile methods are employed on a subset of weighted observations. The estimator is computationally easy to implement in Big Data applications with a large number of…
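The two-step idea in the abstract can be illustrated with a deliberately simplified sketch. This is not the authors' panel estimator: it assumes a single cross-section, an intercept-only quantile model, and a nonparametric first step that estimates retention probabilities as cell frequencies within groups of one observed covariate. All function names and the toy data are hypothetical.

```python
from collections import defaultdict


def retention_probabilities(groups, stayed):
    """Step 1 (assumed nonparametric form): share of units still observed in each cell."""
    total = defaultdict(int)
    kept = defaultdict(int)
    for g, s in zip(groups, stayed):
        total[g] += 1
        kept[g] += int(s)
    return {g: kept[g] / total[g] for g in total}


def weighted_quantile(values, weights, tau):
    """Smallest value whose cumulative weight share reaches tau.

    This value minimizes the weighted check loss sum_i w_i * rho_tau(y_i - q).
    """
    pairs = sorted(zip(values, weights))
    total = sum(w for _, w in pairs)
    cum = 0.0
    for y, w in pairs:
        cum += w
        if cum >= tau * total:
            return y
    return pairs[-1][0]


def ipw_quantile(groups, stayed, outcomes, tau):
    """Step 2: quantile of the retained units, weighted by 1 / estimated retention probability."""
    p_hat = retention_probabilities(groups, stayed)
    ys, ws = [], []
    for g, s, y in zip(groups, stayed, outcomes):
        if s:  # only units that did not attrit contribute outcomes
            ys.append(y)
            ws.append(1.0 / p_hat[g])
    return weighted_quantile(ys, ws, tau)


# Toy data (hypothetical): every "low" unit stays, half of the "high" units drop out.
groups = ["low"] * 4 + ["high"] * 4
stayed = [True, True, True, True, True, True, False, False]
outcomes = [1, 2, 3, 4, 10, 11, None, None]

stayers = [y for y, s in zip(outcomes, stayed) if s]
naive = weighted_quantile(stayers, [1.0] * len(stayers), 0.6)
corrected = ipw_quantile(groups, stayed, outcomes, 0.6)
print(naive, corrected)
```

In this toy sample the unweighted quantile of the stayers is pulled toward the group that never drops out, while the reweighted version gives each retained "high" unit twice the weight to stand in for the unit that left. A regression version would replace the weighted quantile with a weighted quantile regression and add the panel structure described in the abstract.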
Taxonomy
Topics: Spatial and Panel Data Analysis · Economic and Environmental Valuation · Housing Market and Economics
