On Suspicious Coincidences and Pointwise Mutual Information
Christopher K. I. Williams

TL;DR
This paper reviews classical and modern measures of association for contingency tables and shows how pointwise mutual information (PMI) and mutual information (MI) relate to traditional metrics such as Yule's Y, with particular attention to how strongly PMI depends on the marginal probabilities.
Contribution
The paper clarifies the relationship between PMI, MI, and classical measures such as Yule's Y, and emphasizes that PMI is sensitive to the marginal probabilities, giving higher scores to sparser events.
Findings
PMI and MI behave similarly to Yule's Y when marginal effects are removed.
PMI is sensitive to marginal probabilities, especially for sparse events.
Classical measures and PMI/MI are related through the odds ratio λ.
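For reference, the quantities named in these findings have the following standard definitions for a table over two binary events A and B. The cell notation p_{ij} (with p_{11} = P(A, B)) is an assumption of this sketch and may differ from the paper's own notation.

```latex
\lambda = \frac{p_{11}\,p_{00}}{p_{10}\,p_{01}}, \qquad
Y = \frac{\sqrt{\lambda} - 1}{\sqrt{\lambda} + 1}, \qquad
\mathrm{PMI}(A,B) = \log \frac{P(A,B)}{P(A)\,P(B)}, \qquad
\mathrm{MI} = \sum_{i,j} p_{ij} \log \frac{p_{ij}}{p_{i\cdot}\,p_{\cdot j}}
```

Y is a function of λ alone, whereas PMI and MI also involve the row and column marginals p_{i·} and p_{·j}.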
Abstract
Barlow (1985) hypothesized that the co-occurrence of two events A and B is "suspicious" if P(A, B) ≫ P(A) P(B). We first review classical measures of association for contingency tables, including Yule's Y (Yule, 1912), which depends only on the odds ratio λ, and is independent of the marginal probabilities of the table. We then discuss the mutual information (MI) and pointwise mutual information (PMI), which depend on the ratio P(A, B) / (P(A) P(B)), as measures of association. We show that, once the effect of the marginals is removed, MI and PMI behave similarly to Y as functions of λ. The pointwise mutual information is used extensively in some research communities for flagging suspicious coincidences, but it is important to bear in mind the sensitivity of the PMI to the marginals, with increased scores for sparser events.
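The marginal sensitivity described in the abstract can be illustrated with a minimal Python sketch. It assumes a 2×2 joint table over two binary events, here called A and B, and builds two tables that share the same odds ratio but differ in how rare the events are. The helper `joint_with_odds_ratio` and its scaling knobs `r` and `c` are hypothetical constructions for this illustration, not something taken from the paper.

```python
import math


def joint_with_odds_ratio(lam, r, c):
    """Return a 2x2 joint table [[p11, p10], [p01, p00]] with odds ratio lam.

    r and c are hypothetical scaling knobs for this sketch: smaller values
    make event A (first row) and event B (first column) rarer. Scaling a row
    or column of the unnormalized weights leaves the odds ratio unchanged.
    """
    weights = [[lam * r * c, r], [c, 1.0]]
    total = sum(sum(row) for row in weights)
    return [[w / total for w in row] for row in weights]


def odds_ratio(p):
    """Odds ratio: (p11 * p00) / (p10 * p01)."""
    return (p[0][0] * p[1][1]) / (p[0][1] * p[1][0])


def yule_y(p):
    """Yule's Y, a function of the odds ratio only."""
    s = math.sqrt(odds_ratio(p))
    return (s - 1.0) / (s + 1.0)


def pmi(p):
    """Pointwise mutual information of the joint event (A, B), in nats."""
    p_a = p[0][0] + p[0][1]  # marginal P(A)
    p_b = p[0][0] + p[1][0]  # marginal P(B)
    return math.log(p[0][0] / (p_a * p_b))


# Same odds ratio (4.0), very different marginals.
common = joint_with_odds_ratio(4.0, 1.0, 1.0)
sparse = joint_with_odds_ratio(4.0, 0.01, 0.01)

for name, table in [("common", common), ("sparse", sparse)]:
    print(f"{name}: Y = {yule_y(table):.3f}, PMI = {pmi(table):.3f}")
```

In this construction Yule's Y is identical for both tables, while the PMI is larger for the table in which A and B are rare, even though the odds ratio has not changed.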
Taxonomy
Topics: Computability, Logic, AI Algorithms
