Sufficient Statistics and Split Idempotents in Discrete Probability Theory
Bart Jacobs

TL;DR
This paper explores the concept of sufficient statistics within discrete probability theory, using string diagrams and split idempotents to provide a new categorical perspective on their fundamental properties.
Contribution
It introduces a categorical framework for understanding sufficient statistics in discrete probability, emphasizing the role of split idempotents and dagger structures.
Findings
Identifies fundamental examples of sufficient statistics in discrete probability
Reveals the role of dagger split idempotents in the structure of sufficient statistics
Shows that a sufficient statistic is a deterministic dagger epi
Abstract
A sufficient statistic is a deterministic function that captures an essential property of a probabilistic function (channel, kernel). Being a sufficient statistic can be expressed nicely in terms of string diagrams, as Tobias Fritz showed recently, in adjoint form. This reformulation highlights the role of split idempotents in the Fisher-Neyman factorisation theorem. Examples of a sufficient statistic occur in the literature, but mostly in continuous probability. This paper demonstrates that there are also several fundamental examples of a sufficient statistic in discrete probability. They emerge after some combinatorial groundwork that reveals the relevant dagger split idempotents and shows that a sufficient statistic is a deterministic dagger epi.
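To make the notion concrete, here is a minimal sketch of a standard textbook case, which is not necessarily one of the paper's own examples: for n independent coin flips with bias r, the number of heads is a sufficient statistic, because conditioning the sequence distribution on that count gives the same (uniform) distribution for every r. The names `iid_flips` and `conditional_given_sum` are hypothetical helpers introduced only for this illustration.

```python
from fractions import Fraction
from itertools import product
from math import comb


def iid_flips(r, n):
    """Channel: bias r -> distribution over length-n 0/1 sequences."""
    return {
        xs: r ** sum(xs) * (1 - r) ** (n - sum(xs))
        for xs in product((0, 1), repeat=n)
    }


def conditional_given_sum(dist, k):
    """Condition a distribution over sequences on the event sum(xs) == k."""
    total = sum(p for xs, p in dist.items() if sum(xs) == k)
    return {xs: p / total for xs, p in dist.items() if sum(xs) == k}


n = 3
for k in range(n + 1):
    # Two different biases, chosen arbitrarily for the check.
    c1 = conditional_given_sum(iid_flips(Fraction(1, 4), n), k)
    c2 = conditional_given_sum(iid_flips(Fraction(2, 3), n), k)
    # The conditional does not depend on the bias ...
    assert c1 == c2
    # ... and is uniform over the comb(n, k) sequences with k heads.
    assert all(p == Fraction(1, comb(n, k)) for p in c1.values())
```

Exact `Fraction` arithmetic is used so that the equality checks are not affected by floating-point rounding. The bias-independent conditional is what the Fisher-Neyman factorisation expresses: the sequence distribution factors through the count.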
Taxonomy
Topics: Algorithms and Data Compression · Computability, Logic, AI Algorithms · Semigroups and automata theory
