A Comparison of Hamming Errors of Representative Variable Selection Methods

Zheng Tracy Ke; Longlin Wang

arXiv:2203.15075 · math.ST · March 30, 2022 · ICLR

TL;DR

This paper compares the expected Hamming errors of six variable selection methods (Lasso, Elastic net, SCAD, forward selection, thresholded Lasso, and forward backward selection) in a linear model where the Gram matrix is block-wise diagonal and the regression coefficients are drawn iid from a two-point mixture.

Contribution

The paper derives the rate of convergence of the expected Hamming error for each method, together with the corresponding phase diagrams, and uses them to compare the pros and cons of the methods.

Findings

01

Elastic net and SCAD outperform Lasso in correlated settings.

02

Thresholded Lasso and forward backward selection show competitive Hamming errors.

03

Theoretical phase diagrams illustrate method advantages under different conditions.

Abstract

Lasso is a celebrated method for variable selection in linear models, but it faces challenges when the variables are moderately or strongly correlated. This motivates alternative approaches such as using a non-convex penalty, adding a ridge regularization, or conducting a post-Lasso thresholding. In this paper, we compare Lasso with 5 other methods: Elastic net, SCAD, forward selection, thresholded Lasso, and forward backward selection. We measure their performances theoretically by the expected Hamming error, assuming that the regression coefficients are iid drawn from a two-point mixture and that the Gram matrix is block-wise diagonal. By deriving the rates of convergence of Hamming errors and the phase diagrams, we obtain useful conclusions about the pros and cons of different methods.
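
The setting in the abstract can be illustrated with a small simulation. The sketch below is not the authors' code, and the paper's results are theoretical rather than simulated. It assumes a Gram matrix built from 2x2 blocks with off-diagonal correlation `rho`, a two-point mixture that puts mass `eps = p**(-theta)` on a signal of size `tau = sqrt(2 r log p)` and the rest on zero, standard normal noise, and a Hamming error defined as the number of coordinates whose selected/not-selected status disagrees with the truth. The parameter names `theta`, `r`, `rho`, the tuning values, and all function names are illustrative assumptions, not taken from the page.

```python
import numpy as np


def block_design(p, rho):
    """Return (X, G) with X.T @ X == G, where G is block diagonal
    with 2x2 blocks [[1, rho], [rho, 1]] (assumed block structure)."""
    block = np.array([[1.0, rho], [rho, 1.0]])
    G = np.kron(np.eye(p // 2), block)   # p must be even, |rho| < 1
    X = np.linalg.cholesky(G).T          # G = L L^T, so (L^T)^T (L^T) = G
    return X, G


def two_point_beta(p, eps, tau, rng):
    """Draw iid coefficients: tau with probability eps, otherwise 0."""
    return tau * (rng.random(p) < eps)


def lasso_cd(G, w, lam, n_iter=100):
    """Coordinate descent for 0.5*||y - X b||^2 + lam*||b||_1,
    written in terms of G = X.T @ X and w = X.T @ y."""
    b = np.zeros(len(w))
    for _ in range(n_iter):
        for j in range(len(w)):
            # Partial residual correlation with coordinate j held out.
            r_j = w[j] - G[j] @ b + G[j, j] * b[j]
            # Soft-thresholding update.
            b[j] = np.sign(r_j) * max(abs(r_j) - lam, 0.0) / G[j, j]
    return b


def hamming_error(beta_hat, beta):
    """Count coordinates where the estimated support disagrees with the truth."""
    return int(np.sum((beta_hat != 0) != (beta != 0)))


def simulate(p=200, theta=0.5, r=3.0, rho=0.5, seed=0):
    """One draw of the model; returns Hamming errors of two selectors."""
    rng = np.random.default_rng(seed)
    eps = p ** (-theta)                      # assumed sparsity level
    tau = np.sqrt(2.0 * r * np.log(p))       # assumed signal strength
    X, G = block_design(p, rho)
    beta = two_point_beta(p, eps, tau, rng)
    y = X @ beta + rng.standard_normal(p)
    w = X.T @ y
    lam = np.sqrt(2.0 * np.log(p))           # illustrative tuning, not from the paper
    b_lasso = lasso_cd(G, w, lam)
    # Thresholded Lasso: a lighter penalty followed by hard thresholding.
    b_small = lasso_cd(G, w, 0.5 * lam)
    b_thresh = np.where(np.abs(b_small) > 0.5 * tau, b_small, 0.0)
    return {
        "lasso": hamming_error(b_lasso, beta),
        "thresholded_lasso": hamming_error(b_thresh, beta),
    }


if __name__ == "__main__":
    print(simulate())
```

Averaging `simulate` over many seeds gives a Monte Carlo estimate of the expected Hamming error for each selector under these assumed settings. A single draw is noisy and should not be read as evidence for or against any method.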

Videos

A Comparison of Hamming Errors of Representative Variable Selection Methods · SlidesLive

Taxonomy

Topics: Statistical Methods and Inference · Markov Chains and Monte Carlo Methods · Bayesian Methods and Mixture Models