
TL;DR
This paper shows how a fraudster can manipulate data so that it still follows Benford's law, which highlights the limits of the law as a fraud-detection tool. It proposes a flexible family of distributions for constructing such manipulated datasets.
Contribution
It introduces a general family of distributions that lets manipulated data conform to Benford's law while the fraudster controls key dataset parameters: minimum, maximum, mean, and size.
Findings
Fraudsters can manipulate data to follow Benford's law while still meeting constraints on the dataset's minimum, maximum, mean, and size.
Benford's law alone may be insufficient for reliable fraud detection.
The paper provides a framework for understanding data manipulation under Benford's law.
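To make the findings concrete, here is a minimal sketch of the kind of first-digit check that Benford-based fraud detection relies on. It assumes the standard first-digit form of the law, P(d) = log10(1 + 1/d), and a Pearson chi-square statistic; the function names (`benford_probs`, `first_digit`, `chi_square_stat`) are illustrative and do not come from the paper.

```python
import math
from collections import Counter

def benford_probs():
    """Expected first-digit probabilities under Benford's law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}

def first_digit(x):
    """Leading significant digit of a nonzero number."""
    # Scientific notation puts the leading digit first, e.g. '4.56e-03'.
    return int(f"{abs(x):.15e}"[0])

def chi_square_stat(values):
    """Pearson chi-square statistic of observed first digits vs. Benford."""
    counts = Counter(first_digit(v) for v in values if v != 0)
    n = sum(counts.values())
    return sum(
        (counts.get(d, 0) - n * p) ** 2 / (n * p)
        for d, p in benford_probs().items()
    )
```

A small statistic means the first digits look Benford-like. The paper's point is that a dataset built to pass this kind of check can still be fabricated, so a low value is not evidence of honesty.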
Abstract
Benford's law is widely used for fraud detection. The underlying assumption for using the law is that a "regular" dataset follows the significant digit phenomenon. In this paper, we address the scenario where a shrewd fraudster manipulates a list of numbers in such a way that it still complies with Benford's law. We develop a general family of distributions that provides several degrees of freedom to such a fraudster, such as the minimum, maximum, mean, and size of the manipulated dataset. The conclusion further corroborates the idea that Benford's law should be used with utmost discretion as a means of fraud detection.
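As a rough illustration of the scenario in the abstract, the sketch below fabricates a dataset of a chosen size whose first digits follow Benford's law. It is not the paper's family of distributions. It uses a simpler, well-known fact: if log10(X) is uniform over a whole number of decades, then X satisfies Benford's law. The function name `log_uniform_sample` and its parameters are assumptions made for this example, and it controls only the minimum, maximum, and size, not the mean.

```python
import math
import random
from collections import Counter

def log_uniform_sample(lo, hi, n, seed=0):
    """Draw n values in [lo, hi] with log10(value) uniformly distributed.

    Assumes hi / lo is an integer power of 10 (e.g. 1 to 1000); only then
    does the sample follow Benford's law in expectation.
    """
    rng = random.Random(seed)
    a, b = math.log10(lo), math.log10(hi)
    return [10 ** rng.uniform(a, b) for _ in range(n)]

def first_digit_freqs(values):
    """Observed relative frequency of each leading digit 1..9."""
    counts = Counter(int(f"{abs(v):.15e}"[0]) for v in values if v != 0)
    n = sum(counts.values())
    return {d: counts.get(d, 0) / n for d in range(1, 10)}

# Fabricated "amounts" between 1 and 1000 (three full decades).
fake = log_uniform_sample(1, 1000, 20000)
freqs = first_digit_freqs(fake)
for d in range(1, 10):
    print(d, round(freqs[d], 3), round(math.log10(1 + 1 / d), 3))
```

The printed frequencies should sit close to the Benford probabilities even though every value is synthetic, which is why a first-digit test alone cannot separate genuine data from carefully constructed data.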
Taxonomy
Topics: Benford’s Law and Fraud Detection · Digital Media Forensic Detection · Imbalanced Data Classification Techniques
