Are your data really Pareto distributed?
Pasquale Cirillo

TL;DR
This paper critically examines common graphical methods used to identify Pareto distributions in data, highlighting their limitations and proposing additional tools for more reliable analysis.
Contribution
It provides a comprehensive review of existing plots for Pareto detection and introduces new tools to improve the reliability of identifying power-law behavior.
Findings
Graphical tools like Zipf and mean excess plots can be misleading when used alone.
A combination of multiple plots increases confidence in Pareto distribution detection.
Proposed additional tools help refine the analysis of power-law presence in data.
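As a rough illustration of one of the plots named above, the empirical mean excess function can be sketched in a few lines. This is not code from the paper: the function name `mean_excess` and the threshold grid are assumptions made here for illustration. For a Pareto distribution with tail index alpha > 1, the theoretical mean excess e(u) = u / (alpha - 1) grows linearly in the threshold u, which is the pattern the plot is meant to reveal.

```python
import numpy as np

def mean_excess(data, thresholds):
    """Empirical mean excess e(u) = mean(X - u | X > u) for each threshold u.

    Hypothetical helper written for illustration; not the paper's code.
    Returns NaN where no observation exceeds the threshold.
    """
    data = np.asarray(data, dtype=float)
    values = []
    for u in thresholds:
        exceedances = data[data > u]
        values.append(exceedances.mean() - u if exceedances.size else np.nan)
    return np.array(values)

# Small demonstration on simulated data (seed and sample size are arbitrary).
rng = np.random.default_rng(0)
pareto_sample = rng.pareto(3.0, 5000) + 1.0   # classical Pareto, x_min = 1, alpha = 3
exp_sample = rng.exponential(1.0, 5000)       # light-tailed comparison sample

grid = np.linspace(1.0, 3.0, 5)
print("Pareto mean excess:     ", mean_excess(pareto_sample, grid))
print("Exponential mean excess:", mean_excess(exp_sample, grid))
```

The estimates at high thresholds rest on very few exceedances and are therefore noisy, which is one reason the plot is hard to read on its own.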
Abstract
Pareto distributions, and power laws in general, have proven to be very useful models for describing very different phenomena, from physics to finance. In recent years, the econophysical literature has proposed a large number of papers and models justifying the presence of power laws in economic data. Most of the time, this Paretianity is inferred from the observation of some plots, such as the Zipf plot and the mean excess plot. If the Zipf plot looks almost linear, then everything is taken to be fine and the parameters of the Pareto distribution are estimated, often with OLS. Unfortunately, as we show in this paper, these heuristic graphical tools are not reliable. To be more exact, we show that only a combination of plots can give some degree of confidence about the real presence of Paretianity in the data. We start by reviewing some of the most important plots, discussing their points of…
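The Zipf-plot-plus-OLS heuristic described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's procedure: the helper names `zipf_plot_coords` and `ols_slope` are hypothetical, the data are assumed strictly positive, and the empirical survival function is taken as (n - i + 1) / n at the i-th order statistic. The lognormal sample is included only as a non-Pareto comparison.

```python
import numpy as np

def zipf_plot_coords(data):
    """Return (log x, log empirical survival) for strictly positive data.

    Hypothetical helper: the survival estimate at the i-th smallest
    observation (i = 1..n) is (n - i + 1) / n, so it never reaches zero.
    """
    x = np.sort(np.asarray(data, dtype=float))
    n = x.size
    survival = (n - np.arange(n)) / n
    return np.log(x), np.log(survival)

def ols_slope(log_x, log_s):
    """Least-squares line through the Zipf plot; returns (slope, intercept)."""
    slope, intercept = np.polyfit(log_x, log_s, 1)
    return slope, intercept

# Small demonstration on simulated data (seed and sample sizes are arbitrary).
rng = np.random.default_rng(1)
pareto_sample = rng.pareto(2.0, 5000) + 1.0      # classical Pareto, x_min = 1, alpha = 2
lognormal_sample = rng.lognormal(0.0, 1.5, 5000) # not Pareto

for name, sample in [("Pareto", pareto_sample), ("lognormal", lognormal_sample)]:
    slope, _ = ols_slope(*zipf_plot_coords(sample))
    print(f"{name}: fitted Zipf-plot slope = {slope:.2f}")
```

For exact Pareto data the negated slope approximates the tail index. The fit also returns a slope for the lognormal sample, though, so a number coming out of the regression says nothing by itself about whether the Pareto model is appropriate.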
Taxonomy
Topics: Complex Systems and Time Series Analysis · Economic theories and models · Monetary Policy and Economic Impact
