Ultra-high Dimensional Multiple Output Learning With Simultaneous Orthogonal Matching Pursuit: A Sure Screening Approach
Mladen Kolar, Eric P. Xing

TL;DR
This paper introduces a scalable variable screening method based on Simultaneous Orthogonal Matching Pursuit (S-OMP) for ultra-high dimensional multi-task regression, with theoretical guarantees and empirical evidence that it reduces the number of variables efficiently.
Contribution
The paper presents what the authors believe is the first use of relatedness among multiple outputs for fast variable screening, applying S-OMP with theoretical guarantees and a modified BIC for choosing the number of iterations.
Findings
Asymptotically, S-OMP reduces the number of variables to below the sample size without discarding relevant ones.
Joint multi-output screening outperforms separate variable selection.
Empirical results show strong performance in simulations and genetic mapping.
Abstract
We propose a novel application of the Simultaneous Orthogonal Matching Pursuit (S-OMP) procedure for sparsistent variable selection in ultra-high dimensional multi-task regression problems. Screening of variables, as introduced in \cite{fan08sis}, is an efficient and highly scalable way to remove many irrelevant variables from the set of all variables while retaining all the relevant variables. S-OMP can be applied to problems with hundreds of thousands of variables, and once the number of variables is reduced to a manageable size, a more computationally demanding procedure can be used to identify the relevant variables for each of the regression outputs. To our knowledge, this is the first attempt to utilize relatedness of multiple outputs to perform fast screening of relevant variables. As our main theoretical contribution, we prove that, asymptotically, S-OMP is guaranteed to reduce…
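The screening step described in the abstract can be illustrated with a minimal sketch of a generic S-OMP loop. This is not the authors' implementation: it assumes a single design matrix shared by all outputs, roughly standardized columns, an l2 aggregation of correlations across outputs, and a fixed number of steps `n_steps` in place of the paper's modified BIC stopping rule. The function name `somp_screen` and its parameters are hypothetical.

```python
import numpy as np

def somp_screen(X, Y, n_steps):
    """Greedy S-OMP screening sketch.

    X: (n, p) design matrix, assumed shared by all outputs.
    Y: (n, T) matrix with one column per output (task).
    Returns the indices of the selected columns, in selection order.
    """
    residual = Y.copy()
    selected = []
    for _ in range(n_steps):
        # Score each column by the l2 norm, across outputs, of its
        # inner products with the current residuals.
        scores = np.linalg.norm(X.T @ residual, axis=1)
        if selected:
            scores[selected] = -np.inf  # never pick a column twice
        j = int(np.argmax(scores))
        selected.append(j)
        # Refit every output on the selected columns by least squares
        # and recompute the residuals.
        coef, *_ = np.linalg.lstsq(X[:, selected], Y, rcond=None)
        residual = Y - X[:, selected] @ coef
    return selected
```

Because every output contributes to the score of a column, a variable that is weakly relevant to several outputs can still be picked up jointly. After screening, a more expensive per-output selection procedure would be run on `X[:, selected]` only.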
Taxonomy
Topics: Statistical Methods and Inference · Sparse and Compressive Sensing Techniques · Distributed Sensor Networks and Detection Algorithms
