Optimal subsampling for quantile regression in big data

HaiYing Wang; Yanyuan Ma

arXiv:2001.10168 · stat.CO · January 29, 2020 · 5 cites

TL;DR

This paper develops optimal subsampling strategies for large-scale quantile regression, providing scalable algorithms with theoretical guarantees and practical advantages such as avoiding density estimation.

Contribution

It introduces two types of optimal subsampling probabilities for quantile regression, along with scalable algorithms and asymptotic optimality proofs.

Findings

01 Algorithms achieve asymptotic optimality.
02 Method works well with simulated and real data.
03 Standard errors obtained without density estimation.

Abstract

We investigate optimal subsampling for quantile regression. We derive the asymptotic distribution of a general subsampling estimator and then derive two versions of optimal subsampling probabilities. One version minimizes the trace of the asymptotic variance-covariance matrix for a linearly transformed parameter estimator and the other minimizes that of the original parameter estimator. The former does not depend on the densities of the responses given covariates and is easy to implement. Algorithms based on optimal subsampling probabilities are proposed and asymptotic distributions and asymptotic optimality of the resulting estimators are established. Furthermore, we propose an iterative subsampling procedure based on the optimal subsampling probabilities in the linearly transformed parameter estimation which has great scalability to utilize available computational resources. In…
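The density-free subsampling step described in the abstract can be illustrated with a minimal sketch. It assumes (the abstract does not state this form) that the probabilities for the linearly transformed estimator are proportional to |τ − 1{yᵢ − xᵢᵀβ̃ < 0}| · ‖xᵢ‖, where β̃ is a pilot estimate. The function names, the simulated data, and the pilot value below are all hypothetical, and the final minimization over β is left to any quantile regression solver.

```python
import math
import random


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(u * v for u, v in zip(a, b))


def check_loss(u, tau):
    """Quantile check loss: rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def subsampling_probabilities(X, y, beta_pilot, tau):
    """Assumed form: pi_i proportional to |tau - 1{resid_i < 0}| * ||x_i||.

    No conditional density of the response appears in this expression.
    """
    scores = []
    for xi, yi in zip(X, y):
        resid = yi - dot(xi, beta_pilot)
        indicator = 1.0 if resid < 0 else 0.0
        scores.append(abs(tau - indicator) * math.sqrt(dot(xi, xi)))
    total = sum(scores)
    return [s / total for s in scores]


def draw_subsample(probs, r, rng):
    """Draw r indices with replacement according to probs."""
    return rng.choices(range(len(probs)), weights=probs, k=r)


def weighted_objective(beta, X, y, idx, probs, tau):
    """Inverse-probability-weighted check loss on the subsample.

    Minimizing this over beta gives the subsample estimator.
    """
    n, r = len(y), len(idx)
    return sum(
        check_loss(y[i] - dot(X[i], beta), tau) / probs[i] for i in idx
    ) / (n * r)


# Simulated data: intercept plus one covariate (illustrative only).
rng = random.Random(0)
n, tau = 1000, 0.75
X = [[1.0, rng.gauss(0, 1)] for _ in range(n)]
y = [1.0 + 2.0 * x[1] + rng.gauss(0, 1) for x in X]

# Hypothetical pilot estimate; in practice it would come from a small
# uniform subsample fitted with a quantile regression solver.
beta_pilot = [1.5, 2.0]

probs = subsampling_probabilities(X, y, beta_pilot, tau)
idx = draw_subsample(probs, 100, rng)
print(weighted_objective(beta_pilot, X, y, idx, probs, tau))
```

Dividing each sampled term by its probability keeps the subsample objective an unbiased estimate of the full-data check loss, which is why points drawn with high probability receive small weights.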

Taxonomy

Topics: Statistical Methods and Inference · Statistical Methods and Bayesian Inference · Bayesian Methods and Mixture Models