A Nonparametric Bayesian Technique for High-Dimensional Regression
Subharup Guha, Veerabhadran Baladandayuthapani

TL;DR
This paper introduces VariScan, a nonparametric Bayesian method for simultaneous clustering, variable selection, and prediction in high-dimensional regression. In simulations and real data analyses it often outperforms several well-known statistical methods.
Contribution
It develops a Bayesian framework that uses Poisson-Dirichlet processes to cluster covariates and select variables in high-dimensional regression, and it shows that cluster detection is a posteriori consistent for a general class of models.
Findings
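VariScan often outperforms several well-known statistical methods in simulation studies and data analyses.
Cluster detection is a posteriori consistent as the number of covariates and subjects grows.
The adaptive nonlinear prediction model balances parsimony and flexibility.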
Abstract
This paper proposes a nonparametric Bayesian framework called VariScan for simultaneous clustering, variable selection, and prediction in high-throughput regression settings. Poisson-Dirichlet processes are utilized to detect lower-dimensional latent clusters of covariates. An adaptive nonlinear prediction model is constructed for the response, achieving a balance between model parsimony and flexibility. Contrary to conventional belief, cluster detection is shown to be a posteriori consistent for a general class of models as the number of covariates and subjects grows. Simulation studies and data analyses demonstrate that VariScan often outperforms several well-known statistical methods.
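To give a feel for how a Poisson-Dirichlet process groups covariates into latent clusters, the sketch below draws one random partition from the standard two-parameter Chinese restaurant scheme. It illustrates the prior on partitions only and is not the VariScan sampler. The function name `sample_pd_partition`, the parameter names `discount` and `concentration`, and the default values are assumptions made for illustration.

```python
import random


def sample_pd_partition(p, discount=0.5, concentration=1.0, seed=0):
    """Draw cluster labels for p covariates from a two-parameter
    Chinese restaurant process (an illustrative prior draw only)."""
    if not 0.0 <= discount < 1.0 or concentration <= -discount:
        raise ValueError("need 0 <= discount < 1 and concentration > -discount")
    if p < 1:
        return [], []

    rng = random.Random(seed)
    labels = [0]   # the first covariate always opens cluster 0
    sizes = [1]    # sizes[k] = number of covariates in cluster k

    for _ in range(1, p):
        # An existing cluster k is chosen with weight (size_k - discount);
        # a new cluster with weight (concentration + discount * K).
        weights = [size - discount for size in sizes]
        weights.append(concentration + discount * len(sizes))
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(sizes):
            sizes.append(1)
        else:
            sizes[k] += 1
        labels.append(k)
    return labels, sizes


labels, sizes = sample_pd_partition(200)
print(len(sizes), "clusters for", len(labels), "covariates")
```

In this scheme a larger `discount` tends to produce more clusters, many of them small, while `discount=0` reduces to the ordinary Dirichlet process partition.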
