Identifying important predictors in large data bases -- multiple testing and model selection

Malgorzata Bogdan; Florian Frommlet

arXiv:2011.12154·stat.ME·November 25, 2020

TL;DR

This chapter reviews and compares model selection methods for high-dimensional data, with a focus on methods that control the false discovery rate (FDR). It covers modifications of information criteria and penalized likelihood approaches such as SLOPE and SLOBE.

Contribution

The chapter introduces modifications of information criteria that are suitable when p > n and compares them with penalized likelihood methods, in particular SLOPE and SLOBE.

Findings

01

The methods in focus control the false discovery rate (FDR) in model identification.

02

Penalized likelihood methods outperform traditional criteria in high-dimensional data.

03

Simulation results illustrate how the performance of the different methods varies across situations.

Abstract

This is a chapter of the forthcoming Handbook of Multiple Testing. We consider a variety of model selection strategies in a high-dimensional setting, where the number of potential predictors p is large compared to the number of available observations n. In particular, modifications of information criteria that are suitable when p > n are introduced and compared with a variety of penalized likelihood methods, notably SLOPE and SLOBE. The focus is on methods which control the FDR in terms of model identification. Theoretical results are provided with respect to both model identification and prediction, and various simulation results are presented which illustrate the performance of the different methods in different situations.
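
The abstract names FDR control and SLOPE without giving formulas. As a rough illustration only, the sketch below shows the Benjamini-Hochberg step-up rule and the BH-type decreasing penalty sequence that is commonly paired with SLOPE's sorted L1 norm. The function names, the choice of this particular sequence, and the example inputs are assumptions made for illustration, not taken from the chapter; no fitting procedure is implemented.

```python
from statistics import NormalDist


def benjamini_hochberg(pvalues, q):
    """Indices rejected by the BH step-up rule at nominal FDR level q."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k = 0  # largest rank whose p-value falls below its threshold
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank * q / m:
            k = rank
    return sorted(order[:k])


def bh_lambda_sequence(p, q):
    """Assumed BH-type sequence: lambda_i = Phi^{-1}(1 - i*q / (2p)), i = 1..p."""
    nd = NormalDist()
    return [nd.inv_cdf(1 - i * q / (2 * p)) for i in range(1, p + 1)]


def sorted_l1_norm(beta, lambdas):
    """Sorted L1 norm: sum_i lambda_i * |beta|_(i), magnitudes in decreasing order."""
    mags = sorted((abs(b) for b in beta), reverse=True)
    return sum(lam * mag for lam, mag in zip(lambdas, mags))


if __name__ == "__main__":
    # Hypothetical p-values, for illustration only.
    print(benjamini_hochberg([0.001, 0.8, 0.02, 0.04, 0.5], q=0.1))
    print(bh_lambda_sequence(p=5, q=0.1))
```

Because the penalty sequence is decreasing, the largest coefficient magnitudes receive the largest penalties, which mirrors how BH compares the smallest p-values against the strictest thresholds.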
