Linear shrinkage for predicting responses in large-scale multivariate linear regression

Yihe Wang; Sihai Dave Zhao

arXiv:2104.08970 · stat.ME · April 20, 2021

TL;DR

This paper introduces a computationally efficient, tuning-free linear shrinkage method for multivariate linear regression with a very large number of outcomes. The method can asymptotically outperform ordinary least squares without structural assumptions on the true regression coefficients.

Contribution

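It proposes a scalable shrinkage approach for multivariate regression that applies linear shrinkage to ordinary least squares estimates, avoids parameter tuning, and targets settings where the number of features is smaller than the sample size but the number of outcomes is extremely large.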

Findings

01 Can asymptotically outperform ordinary least squares, with no structural assumptions on the true regression coefficients

02 Computationally efficient and tuning-free

03 Performs well in simulations and in an analysis of single-cell RNA-seq data

Abstract

We propose a new prediction method for multivariate linear regression problems where the number of features is less than the sample size but the number of outcomes is extremely large. Many popular procedures, such as penalized regression procedures, require parameter tuning that is computationally untenable in such large-scale problems. We take a different approach, motivated by ideas from simultaneous estimation problems, that performs linear shrinkage on ordinary least squares parameter estimates. Our approach is extremely computationally efficient and tuning-free. We show that it can asymptotically outperform ordinary least squares without any structural assumptions on the true regression coefficients and illustrate its good performance in simulations and an analysis of single-cell RNA-seq data.
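The abstract does not give the estimator itself, so the following is only a minimal sketch of the general idea of linearly shrinking ordinary least squares estimates, not the authors' procedure. It assumes a single pooled shrinkage factor of the classical positive-part James-Stein type applied to all outcomes at once; the function names (`ols_coefficients`, `pooled_shrinkage_factor`, `shrink`) and the choice of factor are hypothetical and introduced here for illustration.

```python
import numpy as np


def ols_coefficients(X, Y):
    """Least squares fit for all outcomes at once.

    X has shape (n, p), Y has shape (n, q); the result has shape (p, q).
    """
    B_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return B_hat


def pooled_shrinkage_factor(X, Y, B_hat):
    """Illustrative pooled factor c in [0, 1] (an assumption, not the paper's).

    Uses c = max(0, 1 - p * sum_j sigma_j^2 / sum_j ||X b_j||^2), where
    sigma_j^2 is the usual residual variance estimate for outcome j.
    """
    n, p = X.shape
    fitted = X @ B_hat
    rss = np.sum((Y - fitted) ** 2, axis=0)   # residual sum of squares per outcome
    sigma2 = rss / (n - p)                    # unbiased noise variance per outcome
    fitted_sq = np.sum(fitted ** 2)           # total squared norm of fitted values
    if fitted_sq == 0.0:
        return 0.0
    return float(max(0.0, 1.0 - p * np.sum(sigma2) / fitted_sq))


def shrink(B_hat, c):
    """Linear shrinkage of the OLS coefficients toward zero."""
    return c * B_hat


# Hypothetical example: few features, many outcomes, weak signal.
rng = np.random.default_rng(0)
n, p, q = 200, 5, 1000
X = rng.normal(size=(n, p))
B_true = 0.05 * rng.normal(size=(p, q))
Y = X @ B_true + rng.normal(size=(n, q))

B_hat = ols_coefficients(X, Y)
c = pooled_shrinkage_factor(X, Y, B_hat)
B_shrunk = shrink(B_hat, c)
```

Because the factor is a closed-form function of quantities already computed for the least squares fit, a sketch like this needs no cross-validation or tuning grid, which is the computational advantage the abstract emphasizes over penalized regression.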

Taxonomy

Topics: Single-cell and spatial transcriptomics · Statistical Methods and Inference · Gene expression and cancer classification