Simpson's Paradox with Any Given Number of Factors
Guisheng Dai, Weizhen Wang

TL;DR
This paper generalizes Simpson's Paradox to any number of factors, showing that each additional variable can again reverse the perceived effect of a factor, which challenges the assumption that accounting for more variables yields more reliable conclusions.
Contribution
It introduces the concept of n-factor Simpson's Paradox, providing a formal definition and geometric construction demonstrating its existence for any number of factors.
Findings
Existence of probability distributions exhibiting n-factor Simpson's Paradox for any n
Construction of explicit examples for n=3
Reveals complexity of statistical inference with multiple confounders
Abstract
Simpson's Paradox is a well-known phenomenon in statistical science, where the relationship between the response variable and a certain explanatory factor of interest reverses when an additional factor is considered. This paper explores the extension of Simpson's Paradox to any given number n of factors, referred to as the n-factor Simpson's Paradox. We first provide a rigorous definition of the n-factor Simpson's Paradox, then demonstrate the existence of a probability distribution exhibiting it through a geometric construction. Specifically, we show that for any positive integer n, it is possible to construct a probability distribution in which the conclusion about the effect of the factor of interest on the response variable reverses each time an additional factor is introduced, up to n factors. A detailed example for n = 3 illustrates the construction. Our results highlight that, contrary to the…
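The reversal described in the abstract can be checked numerically in its simplest form, with a single additional factor. The sketch below is only an illustration of that one-factor case: the counts, the treatment labels "A" and "B", the strata "small" and "large", and the helper `rate` are all assumptions made for this example and do not come from the paper, whose geometric construction for general n is not reproduced here.

```python
from fractions import Fraction

# Illustrative counts (an assumption, not data from the paper):
# (successes, trials) for each (treatment, stratum) cell.
# The stratum plays the role of the one additional factor.
counts = {
    ("A", "small"): (81, 87),
    ("A", "large"): (192, 263),
    ("B", "small"): (234, 270),
    ("B", "large"): (55, 80),
}

def rate(treatment, stratum=None):
    """Success rate for a treatment, pooled or within one stratum."""
    cells = [
        cell for (t, s), cell in counts.items()
        if t == treatment and (stratum is None or s == stratum)
    ]
    successes = sum(c[0] for c in cells)
    trials = sum(c[1] for c in cells)
    return Fraction(successes, trials)

# Comparison ignoring the additional factor.
marginal_favors_a = rate("A") > rate("B")

# Comparison within every level of the additional factor.
conditional_favors_a = all(
    rate("A", s) > rate("B", s) for s in ("small", "large")
)

# Simpson's Paradox: the two comparisons point in opposite directions.
reversal = marginal_favors_a != conditional_favors_a

print("pooled:", float(rate("A")), "vs", float(rate("B")))
for s in ("small", "large"):
    print(s + ":", float(rate("A", s)), "vs", float(rate("B", s)))
print("reversal:", reversal)
```

With these counts, treatment B looks better when the strata are pooled, while treatment A looks better inside each stratum. An n-factor version would require the direction of the comparison to flip again at each further level of conditioning.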
Taxonomy
Topics: Statistics Education and Methodologies · Advanced Statistical Methods and Models · Benford’s Law and Fraud Detection
