# A Generalized Multinomial Distribution from Dependent Categorical Random Variables

**Authors:** Rachel Traylor

arXiv: 1701.06955 · 2017-01-25

## TL;DR

This paper constructs a sequence of identically distributed but dependent categorical random variables and gives a generalized multinomial distribution for their counts, motivated by the correlation among categorical predictors that arises in machine learning.

## Contribution

It provides a novel construction of dependent categorical variables and derives a generalized multinomial distribution to model their count probabilities.
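
To make "identically distributed but dependent" concrete, here is a minimal Python sketch of one simple dependence scheme. The scheme is an assumption chosen for illustration and is not taken from this summary: each variable after the first copies the first draw with probability `delta`, and is otherwise drawn independently from `p`. The function names and the parameters `p`, `delta`, and `n` are hypothetical.

```python
import random
from collections import Counter


def sample_dependent_sequence(p, delta, n, rng):
    """Draw n categorical variables with common marginal p that depend on the first draw.

    Assumed scheme (illustrative only): every variable after the first copies
    the first outcome with probability delta; otherwise it is an independent
    draw from p. Under this scheme each variable still has marginal p.
    """
    categories = list(range(len(p)))
    first = rng.choices(categories, weights=p)[0]
    seq = [first]
    for _ in range(n - 1):
        if rng.random() < delta:
            seq.append(first)  # dependent branch: repeat the first outcome
        else:
            seq.append(rng.choices(categories, weights=p)[0])  # independent branch
    return seq


def category_counts(seq, k):
    """Return the count of each of the k categories in seq."""
    counts = Counter(seq)
    return [counts.get(j, 0) for j in range(k)]


def empirical_marginal(p, delta, n, position, trials, seed=0):
    """Estimate the distribution of the variable at `position` by simulation."""
    rng = random.Random(seed)
    hits = [0] * len(p)
    for _ in range(trials):
        seq = sample_dependent_sequence(p, delta, n, rng)
        hits[seq[position]] += 1
    return [h / trials for h in hits]
```

Calling `empirical_marginal([0.2, 0.3, 0.5], 0.4, 5, 4, 20000)` should return values close to `p`, while the count vectors from `category_counts` are more concentrated on a single category than an ordinary multinomial would predict. With `delta = 0` the scheme reduces to independent draws, so the counts follow the standard multinomial distribution.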

## Key findings

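- Mathematical construction of a sequence of identically distributed but dependent categorical random variables
- Derivation of a generalized multinomial distribution for the counts of such variables
- Motivation: correlation among categorical predictors has been noted to affect algorithms such as feature ranking and random forests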

## Abstract

Categorical random variables are a common staple in machine learning methods and other applications across disciplines. Many times, correlation within categorical predictors exists, and has been noted to have an effect on various algorithm effectiveness, such as feature ranking and random forests. We present a mathematical construction of a sequence of identically distributed but dependent categorical random variables, and give a generalized multinomial distribution to model the probability of counts of such variables.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1701.06955/full.md

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/1701.06955/full.md

## References

9 references — full list in the complete paper: https://tomesphere.com/paper/1701.06955/full.md

---
Source: https://tomesphere.com/paper/1701.06955