Formalization of the generalized Pareto principle and structural typicality of the 20/80-rule
Antti Hippeläinen

TL;DR
This paper formalizes a generalized Pareto principle using gain densities and Lorenz curves, deriving explicit formulas for common distributions and analyzing dataset size effects on the 20/80-rule.
Contribution
It introduces a formal framework connecting the Pareto principle with Lorenz curves and provides closed-form expressions for various distributions.
Findings
Datasets of size 10^2 to 10^5 drawn from exponential and normal distributions are predicted to have p between roughly 0.15 and 0.29.
These values of p (the input fraction, not statistical p-values) are close to the p = 0.2 of the 20/80-rule.
The study discusses the structural ubiquity of Pareto-type imbalances.
Abstract
We formalize a generalized form of the Pareto principle - "fraction p of inputs yields fraction 1-p of outputs" - as a property of non-negative gain densities, working with the decreasing rearrangement to obtain a unique characterization. For probability distributions, the resulting p coincides with 1-k, where k is the Kolkata index of the corresponding Lorenz curve. Within this framework we analyze both constructed gain densities and commonly encountered distribution families. We derive closed-form expressions for p for truncated power-law, exponential, and normal distribution families. Combining these with estimates of the truncation parameter as a function of sample size, we predict that datasets of size 10^2 to 10^5 from exponential and normal families concentrate near p between about 0.15 and 0.29 - values close to the…
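The defining condition "fraction p of inputs yields fraction 1-p of outputs" can be illustrated numerically. The following is a minimal sketch, not the paper's method: it assumes the gain density is the exponential density e^{-x} truncated to [0, T] (already decreasing, so it equals its own decreasing rearrangement), and it assumes T = ln N as a rough estimate of the largest of N exponential samples. The function names `pareto_p` and `truncated_exponential_share` are hypothetical, and the bisection solver stands in for the closed-form expressions mentioned in the abstract.

```python
import math


def pareto_p(cumulative_share, tol=1e-12):
    """Solve G(p) = 1 - p by bisection.

    G(p) is the share of total gain produced by the top fraction p of
    inputs; it must be increasing with G(0) = 0 and G(1) = 1.
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cumulative_share(mid) < 1.0 - mid:
            lo = mid  # top fraction still yields too little: increase p
        else:
            hi = mid
    return 0.5 * (lo + hi)


def truncated_exponential_share(T):
    """Share of the integral of e^{-x} over [0, T] captured by [0, p*T]."""
    total = 1.0 - math.exp(-T)
    return lambda p: (1.0 - math.exp(-p * T)) / total


results = {}
for n in (10**2, 10**3, 10**4, 10**5):
    T = math.log(n)  # ASSUMPTION: truncation point taken as ln N
    results[n] = pareto_p(truncated_exponential_share(T))
    print(n, round(results[n], 3))
```

Under these assumptions p decreases as N grows and stays within the 0.15 to 0.29 range quoted under Findings; a different truncation estimate would shift the numbers.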
