# Flexible Clustering with a Sparse Mixture of Generalized Hyperbolic Distributions

**Authors:** Alexa A. Sochaniwsky, Michael P. B. Gallaugher, Yang Tang, and Paul D. McNicholas

arXiv: 1903.05054 · 2024-06-07

## TL;DR

This paper introduces a flexible clustering method for high-dimensional data using a sparse mixture of generalized hyperbolic distributions, addressing heavy tails and asymmetry with a penalized EM algorithm.

## Contribution

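It proposes a parametrization of the component scale matrices in a mixture of generalized hyperbolic distributions, with a gamma-lasso penalty on the concentration matrix included in the likelihood. This targets the large number of free covariance parameters that hampers model-based clustering of high-dimensional data.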

## Key findings

- The method performs well in simulations.
- It effectively handles heavy-tailed, asymmetric data.
- It is illustrated on two real datasets.

## Abstract

Robust clustering of high-dimensional data is an important topic because clusters in real datasets are often heavy-tailed and/or asymmetric. Traditional approaches to model-based clustering often fail for high-dimensional data, e.g., due to the number of free covariance parameters. A parametrization of the component scale matrices for the mixture of generalized hyperbolic distributions is proposed. This parametrization includes a penalty term in the likelihood. An analytically feasible expectation-maximization algorithm is developed by placing a gamma-lasso penalty constraining the concentration matrix. The proposed methodology is investigated through simulation studies and illustrated using two real datasets.
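To make the idea of a penalty term in the likelihood concrete, the following is a minimal sketch of a penalized mixture log-likelihood. It is not the authors' method: it assumes Gaussian components as a stand-in for generalized hyperbolic ones, and a plain lasso (L1) penalty on the off-diagonal entries of each concentration (inverse scale) matrix as a stand-in for the gamma-lasso penalty. The function names `gaussian_logpdf` and `penalized_loglik` and the tuning parameter `lam` are hypothetical, chosen only for illustration.

```python
import numpy as np


def gaussian_logpdf(X, mean, precision):
    """Log-density of N(mean, precision^{-1}) evaluated at each row of X."""
    p = X.shape[1]
    diff = X - mean
    sign, logdet = np.linalg.slogdet(precision)
    if sign <= 0:
        raise ValueError("precision matrix must be positive definite")
    # Quadratic form diff_i^T * precision * diff_i for every observation i.
    quad = np.einsum("ij,jk,ik->i", diff, precision, diff)
    return 0.5 * (logdet - p * np.log(2.0 * np.pi) - quad)


def penalized_loglik(X, weights, means, precisions, lam):
    """Mixture log-likelihood minus lam times an L1 penalty.

    ASSUMPTION: Gaussian components and a plain lasso penalty on the
    off-diagonal precision entries; the paper uses generalized hyperbolic
    components and a gamma-lasso penalty instead.
    """
    # Column g holds log(pi_g) + log f_g(x_i) for component g.
    log_terms = np.stack(
        [
            np.log(w) + gaussian_logpdf(X, m, P)
            for w, m, P in zip(weights, means, precisions)
        ],
        axis=1,
    )
    # Numerically stable log-sum-exp over components.
    mx = log_terms.max(axis=1, keepdims=True)
    loglik = np.sum(mx[:, 0] + np.log(np.exp(log_terms - mx).sum(axis=1)))
    # Sum of absolute off-diagonal entries across all components.
    penalty = sum(np.abs(P).sum() - np.abs(np.diag(P)).sum() for P in precisions)
    return loglik - lam * penalty
```

In a penalized EM algorithm, an objective of this kind would be increased at each M-step; larger values of `lam` push more off-diagonal concentration entries toward zero, which reduces the effective number of free parameters per component.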

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1903.05054/full.md

## Figures

15 figures with captions in the complete paper: https://tomesphere.com/paper/1903.05054/full.md

## References

40 references — full list in the complete paper: https://tomesphere.com/paper/1903.05054/full.md

---
Source: https://tomesphere.com/paper/1903.05054