# Debiased Lasso After Sample Splitting for Estimation and Inference in High Dimensional Generalized Linear Models

**Authors:** Omar Vazquez (Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, Pennsylvania, U.S.A.) and Bin Nan (Department of Statistics, University of California, Irvine, California, U.S.A.)

arXiv: 2302.14218 · 2023-03-01

## TL;DR

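This paper studies random sample splitting for estimation and inference in high-dimensional generalized linear models: the lasso selects a submodel on one subsample, and the debiased lasso fits that submodel on the remaining subsample. In simulations, this reduces bias and variance relative to using the maximum likelihood estimator in the estimation stage.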

## Contribution

It proposes a multiple splitting procedure based on the debiased lasso that, for a set of prespecified regression coefficients, produces asymptotically normal estimates under mild conditions and addresses the loss of efficiency associated with a single sample split.

## Key findings

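- After a single split, the debiased lasso estimator of the selected submodel is asymptotically normal, whether or not a prespecified subset of regression coefficients is included.
- For prespecified regression coefficients, multiple splitting addresses the loss of efficiency associated with single splitting.
- In simulations, using the debiased lasso instead of the standard maximum likelihood estimator in the estimation stage substantially reduces the bias and variance of the resulting estimates.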

## Abstract

We consider random sample splitting for estimation and inference in high dimensional generalized linear models, where we first apply the lasso to select a submodel using one subsample and then apply the debiased lasso to fit the selected model using the remaining subsample. We show that, whether or not a prespecified subset of regression coefficients is included, the debiased lasso estimator of the selected submodel after a single splitting follows a normal distribution asymptotically. Furthermore, for a set of prespecified regression coefficients, we show that a multiple splitting procedure based on the debiased lasso can address the loss of efficiency associated with sample splitting and produce asymptotically normal estimates under mild conditions. Our simulation results indicate that using the debiased lasso instead of the standard maximum likelihood estimator in the estimation stage can vastly reduce the bias and variance of the resulting estimates. We illustrate the proposed multiple splitting debiased lasso method with an analysis of the smoking data of the Mid-South Tobacco Case-Control Study.
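
The two-stage procedure described in the abstract can be sketched roughly as follows for logistic regression. This is an illustrative sketch, not the authors' implementation. It rests on several assumptions that do not come from this summary: the lasso is solved by a plain proximal gradient loop, the debiasing step is a one-step correction using the inverse Hessian of the selected (low-dimensional) submodel, split-specific estimates are combined by a simple average, the intercept is omitted, and the tuning values are arbitrary. The function names `lasso_logistic`, `debiased_lasso`, and `multi_split` are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lasso_logistic(X, y, lam, n_iter=500):
    """L1-penalized logistic regression via proximal gradient (no intercept)."""
    n, p = X.shape
    # Step size 1/L, where L = ||X||_2^2 / (4n) bounds the Hessian's top eigenvalue.
    step = 4.0 * n / (np.linalg.norm(X, 2) ** 2)
    beta = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (sigmoid(X @ beta) - y) / n
        z = beta - step * grad
        # Soft-thresholding is the proximal operator of the L1 penalty.
        beta = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
    return beta

def debiased_lasso(X, y, lam):
    """One-step correction of a lasso fit (assumed form; needs fewer columns than rows)."""
    n = X.shape[0]
    beta = lasso_logistic(X, y, lam)
    mu = sigmoid(X @ beta)
    score = X.T @ (y - mu) / n
    hessian = (X * (mu * (1.0 - mu))[:, None]).T @ X / n
    theta = np.linalg.inv(hessian)
    # Returns the corrected estimate and a plug-in covariance approximation.
    return beta + theta @ score, theta / n

def multi_split(X, y, target, n_splits=20, lam_select=0.05, lam_fit=0.01, seed=0):
    """Average debiased estimates of the prespecified `target` coefficients over random splits."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    estimates = []
    for _ in range(n_splits):
        perm = rng.permutation(n)
        sel, est = perm[: n // 2], perm[n // 2:]
        # Stage 1: the lasso on the first subsample selects a submodel.
        selected = np.flatnonzero(lasso_logistic(X[sel], y[sel], lam_select))
        # The prespecified coefficients are always kept in the submodel.
        cols = np.union1d(selected, target)
        # Stage 2: the debiased lasso fits the submodel on the remaining subsample.
        beta_db, _ = debiased_lasso(X[est][:, cols], y[est], lam_fit)
        estimates.append(beta_db[np.searchsorted(cols, target)])
    # Simple averaging across splits is an assumption of this sketch.
    return np.mean(estimates, axis=0)
```

Calling `multi_split(X, y, target=[0, 1])` on a design matrix `X` and a 0/1 response `y` returns one averaged estimate per prespecified coefficient. The sketch does not compute a variance for the averaged estimator; the complete paper should be consulted for the variance estimator and the conditions behind the asymptotic normality results.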

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2302.14218/full.md

## References

37 references — full list in the complete paper: https://tomesphere.com/paper/2302.14218/full.md

---
Source: https://tomesphere.com/paper/2302.14218