Two-stage Sampling, Prediction and Adaptive Regression via Correlation Screening (SPARCS)
Hamed Firouzi, Alfred Hero, Bala Rajaratnam

TL;DR
SPARCS is an adaptive two-stage procedure for high-dimensional predictor design that balances sampling cost and prediction accuracy by screening variables early and collecting more data on promising candidates.
Contribution
It introduces a novel two-stage adaptive sampling method with false positive control, optimal sample allocation, and theoretical guarantees for high-dimensional prediction tasks.
Findings
Establishes asymptotic bounds for Familywise Error Rate (FWER)
Provides high-dimensional support recovery convergence rates
Derives optimal sample allocation strategies between stages
Abstract
This paper proposes a general adaptive procedure for budget-limited predictor design in high dimensions called two-stage Sampling, Prediction and Adaptive Regression via Correlation Screening (SPARCS). SPARCS can be applied to high-dimensional prediction problems in experimental science, medicine, finance, and engineering, as illustrated by the following. Suppose one wishes to run a sequence of experiments to learn a sparse multivariate predictor of a dependent variable (disease prognosis for instance) based on a p-dimensional set of independent variables (assayed biomarkers). Assume that the cost of acquiring the full set of variables increases linearly in its dimension. SPARCS breaks the data collection into two stages in order to achieve an optimal tradeoff between sampling cost and predictor performance. In the first stage we collect…
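The two-stage idea summarized above (screen all variables on a small first-stage sample, then collect more data only on the promising candidates) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `make_sampler` and `two_stage_screen_and_fit`, the top-`k` marginal-correlation rule, the pooling of both stages for the final least-squares fit, and the unit cost per variable per sample are all assumptions made for illustration, since the abstract is truncated before it describes the stages in detail.

```python
import numpy as np


def make_sampler(beta, noise=0.1):
    """Hypothetical data source: y = X @ beta + noise, with i.i.d. Gaussian X."""
    p = len(beta)

    def sample(n, idx, rng):
        # Draw n samples, but only "pay for" (return) the variables in idx.
        X_full = rng.standard_normal((n, p))
        y = X_full @ beta + noise * rng.standard_normal(n)
        return X_full[:, idx], y

    return sample


def two_stage_screen_and_fit(sample, p, n1, n2, k, rng):
    """Illustrative two-stage procedure (an assumption, not the paper's exact method)."""
    # Stage 1: a small sample of all p variables, then correlation screening.
    X1, y1 = sample(n1, np.arange(p), rng)
    Xc = X1 - X1.mean(axis=0)
    yc = y1 - y1.mean()
    corr = (Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc))
    selected = np.sort(np.argsort(-np.abs(corr))[:k])  # keep the k largest |corr|

    # Stage 2: a larger sample of only the k selected variables.
    X2, y2 = sample(n2, selected, rng)

    # Pool both stages on the selected variables and fit OLS with an intercept.
    X = np.vstack([X1[:, selected], X2])
    y = np.concatenate([y1, y2])
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)

    # Sampling cost, assumed linear in the number of variables acquired per sample.
    cost = n1 * p + n2 * k
    return selected, coef, cost


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    beta = np.zeros(200)
    beta[[3, 50, 120]] = [3.0, -3.0, 3.0]  # sparse ground truth (synthetic)
    selected, coef, cost = two_stage_screen_and_fit(
        make_sampler(beta), p=200, n1=100, n2=200, k=10, rng=rng
    )
    print("selected variables:", selected)
    print("total sampling cost:", cost)
```

Under the assumed linear cost model, collecting all 300 samples at full dimension would cost 300 × 200 = 60,000 units, whereas the two-stage design above costs 100 × 200 + 200 × 10 = 22,000 units. That gap is the kind of cost versus accuracy tradeoff the abstract refers to.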
