Robust inference under Benford's law
Lucio Barabesi, Andrea Cerioli, Andrea Cerasa, Domenico Perrotta

TL;DR
This paper develops new statistical methods to detect manipulation of data conforming to Benford's law, especially when operators intentionally try to evade detection by mimicking natural digit distributions.
Contribution
It introduces a contamination model for digits, new distributional results, and goodness-of-fit tests tailored for adversarial data manipulation scenarios.
Findings
Proposed tests can identify manipulated data patterns.
Empirical evaluation demonstrates effectiveness of the methods.
Application to trade data shows practical relevance.
Abstract
We address the task of identifying anomalous observations by analyzing digits under the lens of Benford's law. Motivated by the crucial objective of providing reliable statistical analysis of customs declarations, we answer one major and still open question: How can we detect the behavior of operators who are aware of the prevalence of Benford's pattern in the digits of regular observations and try to manipulate their data in such a way that the same pattern also holds after data fabrication? This challenge arises from the ability of highly skilled and strategically minded manipulators in key organizational positions or criminal networks to exploit statistical knowledge and evade detection. For this purpose, we write a specific contamination model for digits, obtain new relevant distributional results and derive appropriate goodness-of-fit statistics for the considered adversarial…
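For orientation, the sketch below shows the classical first-digit check that Benford-based screening usually starts from: the Benford probabilities P(d) = log10(1 + 1/d) for d = 1, ..., 9 and a Pearson chi-square goodness-of-fit statistic. This is not the paper's contamination model or its proposed test statistics, which the abstract does not spell out. The function names, the simulated samples, and the sample size are assumptions made purely for illustration. As the abstract suggests, a manipulator who deliberately preserves the first-digit pattern would likely pass a check of this kind, which is the gap the paper targets.

```python
import math
import random


def benford_first_digit_probs():
    """Benford probabilities P(d) = log10(1 + 1/d) for d = 1..9."""
    return [math.log10(1 + 1 / d) for d in range(1, 10)]


def first_digit(x):
    """Leading significant digit of a nonzero number (hypothetical helper)."""
    if x == 0:
        raise ValueError("zero has no leading significant digit")
    # Scientific notation puts the leading significant digit first.
    return int(f"{abs(x):.15e}"[0])


def chi_square_statistic(values):
    """Pearson chi-square statistic of first digits against Benford's law."""
    probs = benford_first_digit_probs()
    counts = [0] * 9
    for v in values:
        counts[first_digit(v) - 1] += 1
    n = len(values)
    return sum((c - n * p) ** 2 / (n * p) for c, p in zip(counts, probs))


# Illustrative data only (assumption): a log-uniform sample over whole
# decades follows Benford's law; a uniform sample on [1, 10) does not.
rng = random.Random(0)
benford_sample = [10 ** rng.uniform(0, 5) for _ in range(5000)]
uniform_sample = [rng.uniform(1, 10) for _ in range(5000)]

# 5% critical value of the chi-square distribution with 8 degrees of freedom.
CRITICAL_5PCT_DF8 = 15.507

benford_stat = chi_square_statistic(benford_sample)
uniform_stat = chi_square_statistic(uniform_sample)
print("Benford-like sample:", round(benford_stat, 2))
print("Uniform sample:", round(uniform_stat, 2))
print("Uniform sample rejected at 5%:", uniform_stat > CRITICAL_5PCT_DF8)
```

A statistic above the critical value indicates that the first digits depart from Benford's law; a statistic below it says nothing about whether the data were fabricated to mimic the law.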
Taxonomy
Topics: Benford’s Law and Fraud Detection · Complex Systems and Time Series Analysis · Statistical Mechanics and Entropy
