Feature Selection via Block-Regularized Regression

Seyoung Kim; Eric P. Xing

arXiv:1206.3268 · stat.ME · June 18, 2012 · UAI · 2 cites

TL;DR

This paper introduces a block-regularized regression model that effectively identifies contiguous relevant feature blocks in high-dimensional, ordered data, improving variable selection in complex biological and genomic applications.

Contribution

It proposes a sparse regression framework that combines a Laplacian prior with a Markovian process to detect contiguous blocks of relevant features in high-dimensional, ordered data.
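To make the setting concrete, here is a minimal simulation sketch of block-sparse ordered data. It is not the authors' model or code. It assumes, purely for illustration, that a two-state first-order Markov chain along the feature ordering decides which coefficients are active, and that active coefficients are drawn from a Laplace distribution. The function names (`simulate_block_sparse`, `block_lengths`) and all parameter values (`p_enter`, `p_stay`, `scale`, `noise_sd`) are hypothetical.

```python
import numpy as np


def simulate_block_sparse(n_samples=100, n_features=200, p_enter=0.02,
                          p_stay=0.9, scale=1.0, noise_sd=0.5, seed=0):
    """Simulate a linear regression whose relevant features form contiguous blocks.

    Illustrative assumptions (not taken from the paper):
      - a two-state first-order Markov chain over feature positions marks
        each feature as relevant (1) or irrelevant (0);
      - relevant coefficients are Laplace-distributed, all others are zero.
    """
    rng = np.random.default_rng(seed)

    # Walk along the feature ordering; a high p_stay makes active states persist,
    # so relevant features tend to appear as runs rather than isolated positions.
    active = np.zeros(n_features, dtype=int)
    active[0] = rng.random() < p_enter
    for j in range(1, n_features):
        p_on = p_stay if active[j - 1] == 1 else p_enter
        active[j] = rng.random() < p_on

    # Laplace-distributed coefficients at active positions, exact zeros elsewhere.
    beta = np.where(active == 1, rng.laplace(0.0, scale, size=n_features), 0.0)

    # Standard linear model: response = features times coefficients plus noise.
    X = rng.standard_normal((n_samples, n_features))
    y = X @ beta + noise_sd * rng.standard_normal(n_samples)
    return X, y, beta, active


def block_lengths(active):
    """Return the lengths of maximal runs of consecutive active features."""
    lengths, run = [], 0
    for a in active:
        if a:
            run += 1
        elif run:
            lengths.append(run)
            run = 0
    if run:
        lengths.append(run)
    return lengths


if __name__ == "__main__":
    X, y, beta, active = simulate_block_sparse()
    print("number of active features:", int(active.sum()))
    print("block lengths:", block_lengths(active))
```

Raising `p_stay` lengthens the blocks and lowering `p_enter` makes them rarer, which gives a simple way to generate test data for any method that aims to recover contiguous relevant regions.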

Findings

01 Successfully identifies relevant feature blocks in simulated data

02 Demonstrates improved marker detection in biological genome data

03 Employs a sampling-based algorithm for model learning

Abstract

Identifying co-varying causal elements in a very high-dimensional feature space with internal structure, e.g., a space with as many as millions of linearly ordered features, as one typically encounters in problems such as whole genome association (WGA) mapping, remains an open problem in statistical learning. We propose a block-regularized regression model for sparse variable selection in a high-dimensional space where the covariates are linearly ordered and are possibly subject to local statistical linkages (e.g., block structures) due to spatial or temporal proximity of the features. Our goal is to identify a small subset of relevant covariates that are not merely from random positions in the ordering, but are grouped as contiguous blocks within a large number of ordered covariates. Following a typical linear regression framework between the features and the response, our proposed model…

Taxonomy

Topics: Gene expression and cancer classification · Statistical Methods and Inference · Genetic Associations and Epidemiology