How to (Not) Estimate Gini Coefficients for Fat Tailed Variables
Nassim Nicholas Taleb

TL;DR
This paper critically examines the limitations of conventional Gini coefficient estimation methods for fat-tailed variables, proposing improved indirect estimation techniques that reduce bias and error.
Contribution
It introduces a novel indirect estimation approach using maximum likelihood for tail exponents, addressing biases in traditional Gini calculations for fat-tailed data.
Findings
Conventional Gini estimators are unreliable for fat-tailed variables.
Indirect tail-based methods outperform traditional approaches in accuracy.
Proposed methodology offers a simple, efficient way to estimate Gini coefficients with lower error.
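A minimal sketch of the contrast described in these findings, under stated assumptions: the data are exactly Pareto (type I) with known minimum x_min = 1, the tail exponent is estimated by the standard maximum likelihood formula n / Σ ln(x_i / x_min), and the Gini is recovered from the closed form 1 / (2α − 1), valid for α > 1. The function names (gini_direct, alpha_mle, gini_from_alpha, compare) and the simulation settings are illustrative choices, not taken from the paper.

```python
import math
import random


def gini_direct(xs):
    """Conventional nonparametric Gini computed from the ordered sample."""
    s = sorted(xs)
    n = len(s)
    # G = sum_i (2i - n - 1) * x_(i) / (n * sum_i x_i), with i = 1..n
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(s, start=1))
    return weighted / (n * sum(s))


def alpha_mle(xs, x_min=1.0):
    """Maximum likelihood tail exponent for a Pareto with known x_min (assumed)."""
    return len(xs) / sum(math.log(x / x_min) for x in xs)


def gini_from_alpha(alpha):
    """Closed-form Gini of a Pareto distribution; needs a finite mean."""
    if alpha <= 1.0:
        raise ValueError("closed form requires alpha > 1")
    return 1.0 / (2.0 * alpha - 1.0)


def compare(alpha=1.2, n=1000, trials=100, seed=0):
    """Return (true Gini, mean direct estimate, mean indirect estimate)."""
    rng = random.Random(seed)
    direct = 0.0
    indirect = 0.0
    for _ in range(trials):
        xs = [rng.paretovariate(alpha) for _ in range(n)]
        direct += gini_direct(xs)
        indirect += gini_from_alpha(alpha_mle(xs))
    return gini_from_alpha(alpha), direct / trials, indirect / trials


if __name__ == "__main__":
    true_g, mean_direct, mean_indirect = compare()
    print(f"true Gini      : {true_g:.4f}")
    print(f"direct (mean)  : {mean_direct:.4f}")
    print(f"indirect (mean): {mean_indirect:.4f}")
```

The sketch relies on the sample really being Pareto above a known threshold. With real data the threshold and the tail model would themselves have to be chosen and checked.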
Abstract
Direct measurements of Gini coefficients by conventional arithmetic calculations are a poor estimator, even if, paradoxically, they include the entire population: because of super-additivity, they do not lend themselves to comparisons between units of different size, and intertemporal analyses are vitiated by population changes. The Gini of aggregated units A and B will be higher than those of A and B computed separately. This effect becomes more acute with fatness of tails. When the sample size is smaller than the entire population, the error is extremely high. The conventional literature on Gini coefficients cannot be trusted, and comparing countries of different sizes makes no sense; nor does it make sense to make claims of "changes in inequality" based on conventional measures. We compare the standard methodologies to the indirect methods via maximum likelihood estimation of tail…
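The aggregation effect mentioned in the abstract can be illustrated with a small simulation. This is a sketch under assumptions not found in the source: units A and B are independent Pareto samples with the same tail exponent, and "computed separately" is read as the simple average of the two unit-level Ginis. The name aggregation_gap and all parameter values are hypothetical.

```python
import random


def gini_direct(xs):
    """Conventional nonparametric Gini computed from the ordered sample."""
    s = sorted(xs)
    n = len(s)
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(s, start=1))
    return weighted / (n * sum(s))


def aggregation_gap(alpha=1.2, n=500, trials=200, seed=0):
    """Return (mean Gini of pooled A+B, mean of the average Gini of A and B)."""
    rng = random.Random(seed)
    pooled_sum = 0.0
    separate_sum = 0.0
    for _ in range(trials):
        a = [rng.paretovariate(alpha) for _ in range(n)]
        b = [rng.paretovariate(alpha) for _ in range(n)]
        pooled_sum += gini_direct(a + b)  # Gini of the aggregated unit
        separate_sum += 0.5 * (gini_direct(a) + gini_direct(b))
    return pooled_sum / trials, separate_sum / trials


if __name__ == "__main__":
    pooled, separate = aggregation_gap()
    print(f"pooled A+B        : {pooled:.4f}")
    print(f"A and B separately: {separate:.4f}")
```

Lowering alpha in this sketch (while keeping it above 1) is one way to explore how the gap changes as the tails get fatter.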
