# Sparse vertex discriminant analysis: Variable selection for biomedical classification applications

**Authors:** Alfonso Landeros, Seyoon Ko, Jack Z. Chang, Tong Tong Wu, Kenneth Lange

PMC · DOI: 10.1016/j.csda.2025.108125 · Computational Statistics & Data Analysis · 2025-06-01

## TL;DR

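This paper introduces sparse Vertex Discriminant Analysis (VDA), a multiclass classifier that selects variables through nonconvex distance-to-set penalties, and applies it to high-dimensional biomedical data such as gene expression profiles for cancer classification.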

## Contribution

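The paper proposes two versions of sparse Vertex Discriminant Analysis (VDA) for class-specific variable selection in biomedical classification: one for data in which instances are homogeneous with respect to the predictors characterizing classes, and one for data in which they are heterogeneous.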

## Key findings

- Sparse VDA adapts to class-specific variable selection on both simulated and real datasets.
- The empirical studies emphasize cancer classification from gene expression patterns.
- Nonconvex distance-to-set penalties control the number of active variables in VDA.
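
The last finding can be made concrete with a minimal sketch. It assumes the sparsity set is $S_k = \{z : \|z\|_0 \le k\}$, the vectors with at most $k$ nonzero entries. The Euclidean projection onto this set keeps the $k$ largest-magnitude entries, and the penalty is the squared distance to that projection. The names `project_sparse` and `distance_penalty` and the weight `rho` are illustrative and are not taken from the paper.

```python
import numpy as np

def project_sparse(x, k):
    """Euclidean projection onto {z : ||z||_0 <= k}: keep the k largest |x_i|."""
    x = np.asarray(x, dtype=float)
    z = np.zeros_like(x)
    if k <= 0:
        return z
    k = min(k, x.size)
    # Indices of the k entries with the largest absolute value (ties broken arbitrarily).
    keep = np.argpartition(np.abs(x), x.size - k)[x.size - k:]
    z[keep] = x[keep]
    return z

def distance_penalty(x, k, rho=1.0):
    """(rho / 2) * dist(x, S_k)^2, which is zero exactly when x is already k-sparse."""
    x = np.asarray(x, dtype=float)
    return 0.5 * rho * np.sum((x - project_sparse(x, k)) ** 2)

x = np.array([3.0, -1.0, 0.5, -4.0])
x_proj = project_sparse(x, 2)       # only the entries 3.0 and -4.0 survive
penalty = distance_penalty(x, 2)    # half the squared norm of the discarded entries
```

Because the penalty vanishes on the constraint set, driving it to zero with a growing `rho` caps the number of active variables at `k` directly, rather than indirectly through a shrinkage parameter.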

## Abstract

Modern biomedical datasets are often high-dimensional at multiple levels of biological organization. Practitioners must therefore grapple with data to estimate sparse or low-rank structures so as to adhere to the principle of parsimony. Further complicating matters is the presence of groups in data, each of which may have distinct associations with explanatory variables or be characterized by fundamentally different covariates. These themes in data analysis are explored in the context of classification. Vertex Discriminant Analysis (VDA) offers flexible linear and nonlinear models for classification that generalize the advantages of support vector machines to data with multiple classes. The proximal distance principle, which leverages projection and proximal operators in the design of practical algorithms, handily facilitates variable selection in VDA via nonconvex distance-to-set penalties directly controlling the number of active variables. Two flavors of sparse VDA are developed to address data in which instances may be homogeneous or heterogeneous with respect to predictors characterizing classes. Empirical studies illustrate how VDA is adapted to class-specific variable selection on simulated and real datasets, with an emphasis on applications to cancer classification via gene expression patterns.
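
As background for the name, VDA encodes the $c$ classes as the vertices of a regular simplex in $\mathbb{R}^{c-1}$, so that all classes are equidistant from one another. The sketch below builds one standard closed-form set of such vertices and classifies by nearest vertex. The linear predictor `X @ B`, the coefficient matrix `B`, and the function names are assumptions for illustration; the paper's loss function and fitting procedure are not shown here.

```python
import numpy as np

def simplex_vertices(c):
    """Return a (c, c-1) array whose rows are unit-norm, mutually equidistant vertices."""
    if c < 2:
        raise ValueError("need at least two classes")
    a = -(1.0 + np.sqrt(c)) / (c - 1) ** 1.5
    b = np.sqrt(c / (c - 1))
    V = np.full((c, c - 1), a)
    V[0, :] = 1.0 / np.sqrt(c - 1)                 # first vertex: scaled all-ones vector
    V[np.arange(1, c), np.arange(c - 1)] += b      # remaining vertices: a * 1 + b * e_j
    return V

def predict_nearest_vertex(X, B, V):
    """Assign each row of X to the class whose vertex is closest to X @ B."""
    Y = X @ B                                                  # shape (n, c-1)
    d2 = ((Y[:, None, :] - V[None, :, :]) ** 2).sum(axis=2)    # squared distances, (n, c)
    return d2.argmin(axis=1)

V = simplex_vertices(4)
gram = V @ V.T   # diagonal entries are 1, off-diagonal entries are -1/(c-1)
```

In this encoding, variable selection amounts to constraining which entries or rows of `B` may be nonzero, which is where a sparsity projection would enter.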

## Linked entities

- **Diseases:** cancer (MONDO:0004992)

## Full-text entities

- **Diseases:** cancer (MESH:D009369)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12122019/full.md

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12122019/full.md

## References

76 references — full list in the complete paper: https://tomesphere.com/paper/PMC12122019/full.md

---
Source: https://tomesphere.com/paper/PMC12122019