Shrinkage for Categorical Regressors
Phillip Heiler, Jana Mareckova

TL;DR
This paper proposes a new regularization method for categorical regressors that improves estimation accuracy by shrinking group means towards informative estimates, with theoretical analysis and practical demonstrations showing its advantages.
Contribution
It introduces a flexible, closed-form shrinkage estimator for categorical regressors, deriving optimal penalties and demonstrating asymptotic and finite-sample improvements over traditional methods.
Findings
The plug-in estimator attains lower asymptotic risk than OLS when there are more than three groups.
Monte Carlo simulations show robust finite-sample improvements.
Real data applications illustrate practical benefits in panel and difference-in-differences analyses.
Abstract
This paper introduces a flexible regularization approach that reduces point estimation risk of group means stemming from, e.g., categorical regressors, (quasi-)experimental data or panel data models. The loss function is penalized by adding weighted squared ℓ2-norm differences between group location parameters and informative first-stage estimates. Under quadratic loss, the penalized estimation problem has a simple interpretable closed-form solution that nests methods established in the literature on ridge regression, discretized support smoothing kernels and model averaging methods. We derive risk-optimal penalty parameters and propose a plug-in approach for estimation. The large sample properties are analyzed in an asymptotic local-to-zero framework by introducing a class of sequences for close and distant systems of locations that is sufficient for describing a large range of data…
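The closed-form solution mentioned in the abstract can be illustrated with a minimal sketch. It assumes the simplest version of the problem: each observation belongs to exactly one group, the loss is the residual sum of squares, and each group g has its own penalty lam_g pulling its location mu_g towards a given target m_g. Minimizing sum_i (y_i - mu_g(i))^2 + sum_g lam_g (mu_g - m_g)^2 then gives mu_g = (n_g * ybar_g + lam_g * m_g) / (n_g + lam_g), a weighted average of the group mean and the target. The function name, the argument names and the scaling of the penalties are assumptions made for illustration and may differ from the paper's normalization; the risk-optimal and plug-in penalties derived in the paper are not reproduced here.

```python
import numpy as np

def shrink_group_means(y, groups, targets, penalties):
    """Shrink group means towards targets (hypothetical helper, not the paper's code).

    Solves  min_mu  sum_i (y_i - mu_{g(i)})^2 + sum_g lam_g * (mu_g - m_g)^2,
    whose solution is a per-group weighted average of the sample mean and m_g.
    `targets` and `penalties` must follow the sorted order of the group labels.
    """
    y = np.asarray(y, dtype=float)
    labels, idx = np.unique(groups, return_inverse=True)  # sorted labels, integer codes
    n = np.bincount(idx)                                   # group sizes n_g
    ybar = np.bincount(idx, weights=y) / n                 # unpenalized group means (OLS)
    m = np.asarray(targets, dtype=float)
    lam = np.asarray(penalties, dtype=float)
    w = n / (n + lam)                                      # weight on the group's own mean
    return labels, w * ybar + (1.0 - w) * m

# Toy data: three groups of unequal size, shrunk towards the pooled mean.
y = [1.0, 3.0, 2.0, 4.0, 6.0, 10.0]
groups = ["a", "a", "b", "b", "b", "c"]
pooled = np.full(3, np.mean(y))
labels, mu_hat = shrink_group_means(y, groups, pooled, penalties=[2.0, 2.0, 2.0])
print(dict(zip(labels, np.round(mu_hat, 4))))
```

A penalty of zero returns the ordinary group means, while a very large penalty returns the target itself. For a fixed penalty, small groups are pulled towards the target more strongly than large ones, because the weight on the group's own mean is n_g / (n_g + lam_g).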
