Test for the statistical significance of a treatment effect in the presence of hidden sub-populations
Bikram Karmakar, Kumaresh Dhara, Kushal Kumar Dey, Analabha Basu, Anil Ghosh

TL;DR
This paper introduces a method to improve the testing of treatment effects by identifying and adjusting for hidden sub-populations within the data, thereby reducing misleading results caused by population heterogeneity.
Contribution
The paper proposes a novel approach combining clustering and data transformation to account for hidden sub-populations in treatment effect testing.
Findings
Improved accuracy of treatment effect tests on simulated data.
Effective identification of hidden sub-populations using clustering.
Enhanced test reliability on real-world datasets.
Abstract
To test the statistical significance of a treatment effect, we usually compare two parts of a population: one exposed to the treatment and the other not exposed to it. Standard parametric and nonparametric two-sample tests are often used for this comparison. But direct application of these tests can yield misleading results, especially when the population has hidden sub-populations and the impact of the sub-population difference on the study variables dominates the treatment effect. The problem becomes more evident if these sub-populations are represented in widely different proportions in the samples taken from the two parts, which are often referred to as the treatment group and the control group. In this article, we attempt to overcome this problem. Our proposed methods use suitable clustering algorithms to find the hidden sub-populations and…
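The failure mode described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' procedure: the abstract is truncated before the transformation is described, so within-cluster centering is used here purely as an assumed stand-in, and clustering is run on the study variable itself, which is also an assumption. The sub-population means (0 and 8), the group sizes, the 80/20 mixing proportions, and the helper names `welch_z`, `two_sided_p`, and `two_means_1d` are all hypothetical. The simulated data contain no treatment effect, so any significant result is spurious.

```python
import math
import random
import statistics


def welch_z(x, y):
    """Welch statistic for the difference in means (x minus y)."""
    se = math.sqrt(statistics.variance(x) / len(x) + statistics.variance(y) / len(y))
    return (statistics.fmean(x) - statistics.fmean(y)) / se


def two_sided_p(z):
    """Two-sided p-value from a normal approximation (large samples assumed)."""
    return math.erfc(abs(z) / math.sqrt(2))


def two_means_1d(values, iters=20):
    """Plain 1-D k-means with k=2, initialised at the min and max values.

    Returns one 0/1 label per value. Illustrative only; any suitable
    clustering algorithm could be substituted here.
    """
    c0, c1 = min(values), max(values)
    labels = [0] * len(values)
    for _ in range(iters):
        labels = [0 if abs(v - c0) <= abs(v - c1) else 1 for v in values]
        g0 = [v for v, lab in zip(values, labels) if lab == 0]
        g1 = [v for v, lab in zip(values, labels) if lab == 1]
        c0, c1 = statistics.fmean(g0), statistics.fmean(g1)
    return labels


rng = random.Random(0)  # fixed seed so the run is reproducible


def draw(n_a, n_b):
    """n_a draws from sub-population A (mean 0) and n_b from B (mean 8), unit sd."""
    a = [rng.gauss(0.0, 1.0) for _ in range(n_a)]
    b = [rng.gauss(8.0, 1.0) for _ in range(n_b)]
    return a + b


# No treatment effect: both groups draw from the same two sub-populations,
# but in very different proportions (80/20 versus 20/80).
treated = draw(400, 100)
control = draw(100, 400)

# Naive two-sample comparison that ignores the hidden sub-populations.
z_naive = welch_z(treated, control)
p_naive = two_sided_p(z_naive)

# Assumed adjustment: cluster the pooled sample, then subtract each
# cluster's pooled mean before comparing the two groups.
pooled = treated + control
labels = two_means_1d(pooled)
cluster_mean = {
    k: statistics.fmean([v for v, lab in zip(pooled, labels) if lab == k])
    for k in (0, 1)
}
centered = [v - cluster_mean[lab] for v, lab in zip(pooled, labels)]
treated_adj = centered[: len(treated)]
control_adj = centered[len(treated):]

z_adj = welch_z(treated_adj, control_adj)
p_adj = two_sided_p(z_adj)

print(f"naive:    z = {z_naive:.2f}, p = {p_naive:.3g}")
print(f"adjusted: z = {z_adj:.2f}, p = {p_adj:.3g}")
```

In this setup the naive statistic is large in magnitude because the two group means differ by roughly 4.8 through the mixing proportions alone. Centering within clusters removes that between-sub-population shift, so the adjusted statistic reflects only within-cluster differences between the groups. Whether centering is appropriate when a real treatment effect is present depends on details the truncated abstract does not give.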
Taxonomy
Topics: Statistical Methods in Clinical Trials · Statistical Methods and Inference · Bayesian Methods and Mixture Models
