Causal statistical modeling and calculation of distribution functions of classification features
Uwe Petersohn, Thomas Dedek, Sandra Zimmer, Hans Biskupski

TL;DR
This paper introduces a new statistical model for classification feature distributions based on entropy optimization, providing efficient algorithms and demonstrating good approximation quality across various domains.
Contribution
It develops a novel entropy-based model for classification distributions, along with algorithms for probability computation and distribution approximation.
Findings
The model accurately approximates real distributions with 3-5% error.
Algorithms are efficient for practical computation.
The model compares favorably to Zipf's law in examples.
Abstract
Statistical system models provide the basis for examining various kinds of distributions. Classification distributions are a very common and versatile form of statistics in, for example, real economic, social, and IT systems. The statistical distributions of classification features can be used to determine the a priori probabilities in Bayesian networks. We investigate a statistical model of classification distributions based on finding the critical point of a specialized form of entropy. A distribution function for classification features is derived with two parameters: the minimal class and the average number of classes. Efficient algorithms for the computation of the class probabilities and the approximation of real frequency distributions are developed and applied to examples from different domains. The method is compared to established distributions like Zipf's…
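To make the Zipf comparison concrete, here is a minimal sketch of the baseline side only: Zipf class probabilities, the ordinary Shannon entropy of a class distribution, and a simple error measure against observed frequencies. It does not implement the paper's distribution function, which is not given here. The function names, the exponent `s`, the example counts, and the choice of mean absolute error are all assumptions for illustration; the paper's own error metric may differ.

```python
import math

def zipf_probabilities(n_classes, s=1.0):
    """Zipf class probabilities p_k proportional to k**(-s), for k = 1..n_classes."""
    weights = [k ** (-s) for k in range(1, n_classes + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def shannon_entropy(probs):
    """Ordinary Shannon entropy in nats; zero-probability classes are skipped."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def mean_abs_error(observed, model):
    """Mean absolute difference between two equal-length probability vectors."""
    return sum(abs(o - m) for o, m in zip(observed, model)) / len(observed)

# Hypothetical ranked class counts, invented for illustration only.
counts = [50, 25, 15, 10]
observed = [c / sum(counts) for c in counts]
model = zipf_probabilities(len(counts))

print("Zipf probabilities:", model)
print("entropy of observed distribution:", shannon_entropy(observed))
print("mean absolute error vs. Zipf:", mean_abs_error(observed, model))
```

Any candidate model, including the entropy-based one described in the abstract, could be scored against the same observed frequencies by passing its class probabilities to `mean_abs_error` in place of `model`.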
Taxonomy
Topics: Bayesian Modeling and Causal Inference · Time Series Analysis and Forecasting · Data Mining Algorithms and Applications
