# Determining Principal Component Cardinality through the Principle of Minimum Description Length

**Authors:** Ami Tavory

arXiv: 1901.00059 · 2019-07-02

## TL;DR

This paper explores how the Minimum Description Length principle can be used to objectively determine the optimal number of principal components in PCA by reducing the problem to linear regression NML bounds.

## Contribution

It introduces a reduction technique that bounds PCA NML using linear regression NML, facilitating objective model selection in PCA.
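For readers unfamiliar with the term, the standard textbook definition of NML is given here as background. It is not a formula taken from this paper, and the symbols $p$, $\theta$, $\hat{\theta}$, $x$, $y$ and $C$ are notation chosen for this note. For a model class $\{p(\cdot;\theta)\}$ with maximum-likelihood estimator $\hat{\theta}(\cdot)$, the NML distribution and the code length it assigns to observed data $x$ are:

```latex
p_{\mathrm{NML}}(x) = \frac{p\bigl(x;\hat{\theta}(x)\bigr)}{C},
\qquad
C = \int p\bigl(y;\hat{\theta}(y)\bigr)\,dy,
\qquad
-\log p_{\mathrm{NML}}(x) = -\log p\bigl(x;\hat{\theta}(x)\bigr) + \log C .
```

The first term of the code length rewards fit and the second, $\log C$ (often called the parametric complexity), penalizes model richness. The normalizer $C$ integrates over all possible data sets, which is what makes NML hard to evaluate analytically for a model such as PCA.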

## Key findings

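- Shows a general reduction of NML problems to lower-dimension problems
- Bounds the NML of PCA in terms of the known NML of linear regression
- Provides a theoretical framework for choosing the number of principal components by compression-based (MDL) criteria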

## Abstract

PCA (Principal Component Analysis) and its variants are ubiquitous techniques for matrix dimension reduction and reduced-dimension latent-factor extraction. One significant challenge in using PCA is the choice of the number of principal components. The information-theoretic MDL (Minimum Description Length) principle gives objective compression-based criteria for model selection, but it is difficult to analytically apply its modern definition - NML (Normalized Maximum Likelihood) - to the problem of PCA. This work shows a general reduction of NML problems to lower-dimension problems. Applying this reduction, it bounds the NML of PCA in terms of the NML of linear regression, which are known.
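To make the idea of compression-based selection concrete, the following is a minimal sketch of a much cruder criterion than the one the paper derives. It is not the paper's NML bound. It assumes a probabilistic-PCA Gaussian model (Tipping and Bishop) and scores each candidate number of components `k` with a BIC-style two-part code length: negative maximized log-likelihood plus `0.5 * (number of free parameters) * log(n)`. The function names `ppca_codelengths` and `select_num_components` are hypothetical, introduced only for this illustration.

```python
import numpy as np


def ppca_codelengths(X):
    """Approximate code length (in nats) for each candidate k = 0 .. d-1.

    Assumption: probabilistic-PCA Gaussian model with a BIC-style penalty.
    This is an illustrative stand-in, not the NML bound from the paper.
    """
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    if n <= d:
        raise ValueError("this sketch assumes more rows than columns")

    # Center the data; eigenvalues of the (1/n) sample covariance,
    # obtained from the singular values, in descending order.
    Xc = X - X.mean(axis=0)
    lam = np.linalg.svd(Xc, compute_uv=False) ** 2 / n

    lengths = np.empty(d)
    for k in range(d):
        # Residual (noise) variance: mean of the discarded eigenvalues.
        sigma2 = lam[k:].mean()
        # Negative maximized log-likelihood of probabilistic PCA.
        neg_loglik = 0.5 * n * (
            d * np.log(2.0 * np.pi)
            + np.log(lam[:k]).sum()
            + (d - k) * np.log(sigma2)
            + d
        )
        # Free parameters: mean (d), loadings up to rotation, noise variance.
        n_params = d + d * k - k * (k - 1) // 2 + 1
        lengths[k] = neg_loglik + 0.5 * n_params * np.log(n)
    return lengths


def select_num_components(X):
    """Return the k with the shortest approximate code length."""
    return int(np.argmin(ppca_codelengths(X)))


if __name__ == "__main__":
    # Synthetic data: 3 strong latent directions in 10 dimensions plus noise.
    rng = np.random.default_rng(0)
    Q, _ = np.linalg.qr(rng.normal(size=(10, 3)))
    Z = rng.normal(size=(1000, 3)) * np.array([5.0, 4.0, 3.0])
    X = Z @ Q.T + 0.5 * rng.normal(size=(1000, 10))
    print("selected k:", select_num_components(X))
```

The penalty term here is only an asymptotic approximation of model complexity. The paper's contribution is to bound the exact NML of PCA instead, which this sketch does not attempt.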

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1901.00059/full.md

## Figures

9 figures with captions in the complete paper: https://tomesphere.com/paper/1901.00059/full.md

## References

29 references — full list in the complete paper: https://tomesphere.com/paper/1901.00059/full.md

---
Source: https://tomesphere.com/paper/1901.00059